1 Method

This section provides additional information concerning the Method. For the general design, please refer to the main text.

If you want to replicate the experiment, the code and a file (How to conduct the experiment.docx) explaining step by step how to proceed are available in the folder Program_Experiment of my GitHub repository: mathjoss/ExpCommunityVariation.

1.1 Participants

Fifty-six participants took part in this study (38 women and 18 men). They were between 18 and 50 years old (M = 24, SD = 5). Please refer to Participants’ characteristics for the distribution of age, gender, and other measures.

1.2 Setup

The setup consists of four tables (labeled “A,” “B,” “C,” and “D”), each hosting a computer. Tables “A” and “C” face each other, as do tables “B” and “D”. Curtains are placed between the table pairs AC and BD to prevent the participants sitting at one pair of tables from observing those at the other. Stickers are affixed to the computer keyboards to indicate which keys participants can press. Following each round, two of the four participants switch tables, taking their computers with them. The table to which they need to go is specified both in the computer program and on a piece of paper attached to the computer. The laptops are placed on a rotating device, which makes it easier for participants to rotate their screens. When participants arrived in the room, they first read and signed the consent form. Participants were paid 21 euros for their participation.

Illustration 2. Visual representation of the procedure. Please note that there are also two more tables. See below for more information.
Illustration 1. The experimental design. Participants were sitting face to face on the chairs.

1.3 Stimuli

Stimuli were squares of 300 × 300 pixels when all 8 were presented together, and 400 × 400 pixels the rest of the time.

1.4 Procedure

The instructions for each part are presented below, as shown to participants in Dutch.

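  • Passive exposure: Dadelijk zie je een aantal afbeeldingen een voor een op het scherm verschijnen samen met het woord in fantasietaal dat die afbeelding beschrijft. Probeer zo goed mogelijk het juiste woord bij elke afbeelding te onthouden.
    English translation: In a moment you will see a number of images appear on the screen one by one, together with the word in the fantasy language that describes that image. Try to remember the correct word for each image as well as you can.

  • First Testing (Round 0): Nu is het tijd om te testen hoe goed je de fantasietaal onthoudt! U ziet dezelfde afbeeldingen. Denk goed na over hoe je ze een naam zou geven en druk op enter als je klaar bent om te typen (tijdens het typen kun je de scene niet meer zien). Maak je geen zorgen als je de naam van de afbeelding niet meer weet en beproef je geluk door een woord te typen.
    English translation: Now it is time to test how well you remember the fantasy language! You will see the same images. Think carefully about how you would name them and press Enter when you are ready to type (while typing you can no longer see the scene). Do not worry if you no longer remember the name of the image; try your luck by typing a word.

  • Communication game, Guesser: Je partner heeft een afbeelding gezien en een woord getypt om het te beschrijven. Lees het woord op de computer van je partner en kies de juiste afbeelding uit de 8 mogelijke afbeeldingen (gebruik 1-2-3-4-5-6-7-8 op het toetsenbord om een keuze te maken). Je krijgt feedback over je keuze, probeer er van te leren. Maak tijdens het experiment zo min mogelijk fouten! Vergeet niet om de feedback aan je partner te laten zien. Je krijgt feedback over je keuze, probeer daarvan te leren
    English translation: Your partner has seen an image and typed a word to describe it. Read the word on your partner's computer and choose the correct image from the 8 possible images (use 1-2-3-4-5-6-7-8 on the keyboard to make a choice). You will get feedback on your choice; try to learn from it. Make as few mistakes as possible during the experiment! Do not forget to show the feedback to your partner. You will get feedback on your choice; try to learn from it.

  • Communication game, Producer: Nu krijg je een afbeelding te zien. Denk na over hoe je het zou noemen en druk op Enter als je klaar bent om te typen. Als je klaar bent met het schrijven van het woord, druk je nogmaals op enter en draai je de computer zodat je partner het woord kan lezen en kan raden welke afbeelding je beschrijft. Tijdens deze taak mag je de taal niet wijzigen naar Nederlands of andere bestaande talen. Gebruik ook niets dat heel erg op het Nederlands of andere talen lijkt. Ook mag je geen Nederlandse afkortingen, afkortingen of acroniemen gebruiken. Let op, je kunt niet alle letters gebruiken: alleen de letters die zichtbaar zijn op het toetsenbord kunnen worden gebruikt.
    English translation: Now you will be shown an image. Think about what you would call it and press Enter when you are ready to type. When you have finished writing the word, press Enter again and turn the computer so that your partner can read the word and guess which image you are describing. During this task you may not switch the language to Dutch or other existing languages. Also do not use anything that closely resembles Dutch or other languages. You also may not use Dutch abbreviations or acronyms. Note that you cannot use all letters: only the letters that are visible on the keyboard can be used.

  • Testing (Round 10): Nu hoef je niet meer met je partner te communiceren. U moet een reeks afbeeldingen een naam geven in de nieuwe taal die u gebruikte.
    English translation: Now you no longer need to communicate with your partner. You have to name a series of images in the new language that you used.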

1.5 Pre-Tests

When we first designed the experiment, participants had greater exposure to the initial labels (2 exposures of 8 seconds each, compared to 1 exposure of 7 seconds in the final design). During the pre-test, we observed very high recall of these labels. We found that participants who were too well trained showed no flexibility and tended to stick to the initial labels.

We made slight adjustments to the design to reduce participants’ memory of the labels, allowing us to observe flexibility and language change.

It is worth noting that an experimental design with no exposure to initial labels is also possible, although it would introduce additional constraints in terms of experiment management.

We also conducted another pre-test where the experimenter made an error when asking participants to switch seats. As a result, we excluded this data from the analysis. [Note: the results from this group were consistent with our hypothesis, showing adaptation to the biased participant]

1.6 Measures

All measures are described within the results.

1.7 Additional tasks

The study included three additional tasks.

1.7.1 Prosociality scale.

This questionnaire consists of 16 items that measure different aspects of prosocial behavior, such as empathic concern, altruism, and volunteering, using a 5-point Likert scale ranging from 1 (never/almost never) to 5 (always/almost always).

English version:

The following statements describe a large number of common situations. There are no right or wrong answers; the best answer is the immediate, spontaneous one. Read each phrase carefully and fill in the number that reflects your first reaction.

The possible answers were: Never/Almost Never, Rarely, Occasionally, Often, Always/Almost Always

  • I am pleased to help my friends/colleagues in their activities
  • I share the things that I have with my friends
  • I try to help others.
  • I am available for volunteer activities to help those who are in need
  • I am empathic with those who are in need
  • I help immediately those who are in need
  • I do what I can to help others avoid getting into trouble.
  • I intensely feel what others feel
  • I am willing to make my knowledge and abilities available to others
  • I try to console those who are sad
  • I easily lend money or other things
  • I easily put myself in the shoes of those who are in discomfort
  • I try to be close to and take care of those who are in need
  • I easily share with friends any good opportunity that comes to me
  • I spend time with those friends who feel lonely
  • I immediately sense my friends’ discomfort even when it is not directly communicated to me

This was translated into Dutch from English.

Dutch version:

De volgende uitspraken beschrijven een groot aantal veelvoorkomende situaties. Er zijn geen goede of foute antwoorden; het beste antwoord is het onmiddellijke, spontane antwoord. Lees elke zin aandachtig en vul het nummer in dat uw eerste reactie weergeeft. Vergeet niet dat deze gegevens volledig anoniem zijn. Gebruik 1-2-3-4-5 op het toetsenbord om een keuze te maken.

Nooit/bijna nooit, Zelden, Af en toe, Vaak, Altijd/bijna altijd

  • Ik help graag mijn vrienden/collega’s bij hun activiteiten.
  • Ik deel de dingen die ik heb met mijn vrienden.
  • Ik probeer anderen te helpen.
  • Ik ben beschikbaar voor vrijwilligersactiviteiten om mensen in nood te helpen.
  • Ik ben empathisch voor degenen die in nood zijn.
  • Ik help onmiddellijk degenen die in nood zijn.
  • Ik doe wat ik kan om anderen te helpen voorkomen dat ze in de problemen komen.
  • Ik voel intens wat anderen voelen
  • Ik ben bereid mijn kennis en vaardigheden beschikbaar te stellen aan anderen.
  • Ik probeer degenen die verdrietig zijn te troosten.
  • Ik leen gemakkelijk geld of andere dingen uit.
  • Ik plaats mezelf gemakkelijk in de schoenen van degenen die zich ongemakkelijk voelen.
  • Ik probeer dichtbij te zijn en te zorgen voor degenen die in nood zijn
  • Ik deel gemakkelijk elke goede kans die ik krijg met vrienden.
  • Ik breng tijd door met die vrienden die zich eenzaam voelen
  • Ik voel het ongemak van mijn vrienden onmiddellijk, zelfs als het niet rechtstreeks aan mij wordt meegedeeld.

1.7.2 Dictator Game

This version of the Dictator Game was adapted to the context of the experiment. We included this task as another way to measure prosociality. Participants were presented with the following fictional situation:

English version:

Imagine that I give you an additional amount of 100 euros due to the excellent performance of your group during this experiment. Now, you have the choice to keep the full amount for yourself or to share it with the other participants. Since the other participants are not aware of this extra reward, the choice is entirely up to you. How much do you decide to share with the other participants?

They had the choice between:

  • keeping everything for themselves (1)
  • sharing but keeping more for themselves (2)
  • equally splitting (3)
  • sharing but keeping less (4)
  • giving everything (5)

Dutch version:

Stel je voor dat ik je een extra bedrag van 100 euro geef vanwege de uitstekende prestaties van jouw groep tijdens dit experiment. Nu heb je de keuze om het volledige bedrag voor jezelf te houden, of het te delen met de andere deelnemers. Aangezien de andere deelnemers niet op de hoogte zijn van deze extra beloning, is de keuze geheel aan jou. Hoeveel besluit je te delen met de andere deelnemers?

  • (1): 0 euro (Ik houd alle 100 euro)
  • (2): Tussen 0 en 75 euro (Ik deel, maar houd meer voor mezelf)
  • (3): 75 euro (25 euro voor iedereen)
  • (4): Tussen 75 en 100 euro (Ik houd minder dan hen)
  • (5): 100 euro (Ik geef alles weg)

1.7.3 Task-switching experiment

The task-switching experiment was adapted from Rogers and Monsell’s paradigm. An online version is available to try at this address: https://www.psytoolkit.org/experiment-library/taskswitching.html

The code for the task-switching experiment and for the prosociality questionnaire is available alongside the code of the main experiment (see my GitHub repository: mathjoss/ExpCommunityVariation).

Illustration 2. Instructions for the task-switching experiment as presented to the participant. The top row shows the “letter task” (press Q for consonants and P for vowels) and the bottom row shows the instructions for the “number task” (press Q for odd numbers and P for even numbers). The participant is first trained on the letter task only, in the top row (the letter/number combination alternates between the left and the right), then on the number task only, in the bottom row. Afterwards, the participant performs both tasks together (the letter/number combination moves successively from top left, to top right, to bottom right, to bottom left).

2 Data

2.1 Terminology & typographic conventions

Throughout the results, we will use the following terminology:

  • Group: “for each group” means for each group of 4 participants that came to the lab. In the R data, we use several variables: GroupNum refers to the number of the group (1 to 7), GroupType refers to the type of the group (control versus heterogenous), and GroupID is the combination of the two (1_Control, 2_Control, …, 1_Hetero, …, 7_Hetero). “For each group” therefore implicitly means “for each GroupID”.

  • Initial labels refers to the 8 initial words used to describe aliens presented in the passive exposure phase (aike, nusa, …)

  • Unbiased participants are participants 2, 3, and 4 (who can produce all letters), and the biased participant is participant 1 (who cannot produce a and k). This is consistent across all groups.

  • Unbiased letters designate letters that can be produced by all participants (p, s, n, e, i, u), and biased letters designate the 2 letters that cannot be produced by the biased participant (k, a).

Typographic conventions:

  • variable names are represented using italic text

  • emphasis is represented using bold text

  • software, programming concepts, and file or folder names (e.g., applications, packages, or function names) are represented using fixed-width font text

2.2 Read files

We process and clean the files as follows:

After each group session, we use CleanUpFiles.R (see the folder InputFiles of my GitHub repository: mathjoss/ExpCommunityVariation) to perform the following steps:

  1. Data. We merge the datasets obtained from the 4 participants into a single file.

  2. Prosociality. For each participant, we compute the total prosociality score by summing the responses to each question.

  3. Inverse efficiency. For each participant, we compute the inverse efficiency, defined as \(\mathrm{mean}(\mathrm{time}) / \mathrm{mean}(\mathrm{accuracy})\).

  4. Cognitive Flexibility. We measure each participant’s cognitive flexibility by examining their performance in a task that involves both numbers and letters. To do this, we calculate two average times: one for trials where the participant does not switch tasks (two consecutive letter trials or two consecutive number trials) and another for trials where they do switch tasks (moving from numbers to letters or vice versa). The difference between these two averages represents their cognitive flexibility. We further normalize this difference by dividing it by the mean time required for task switching.
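The three per-participant measures above can be sketched as follows. The original processing was done in R (CleanUpFiles.R); this Python version is only an illustration. The function names, the input layout, and the sign of the difference (switch minus no-switch) are assumptions, not taken from the original script. Note also that the prosoc values summarized in the dataset (1.812 to 4.438) look like totals divided by the 16 items, so the stored score may be a per-item mean rather than a raw sum.

```python
from statistics import mean


def prosociality_total(responses):
    """Sum of the item responses (each coded 1 to 5); hypothetical helper."""
    return sum(responses)


def inverse_efficiency(times, accuracies):
    """mean(time) / mean(accuracy), as defined in step 3."""
    return mean(times) / mean(accuracies)


def cognitive_flexibility(trials):
    """trials: list of (is_switch, response_time) pairs (assumed layout).

    Returns the raw difference between mean switch and mean no-switch
    times, and that difference divided by the mean switch time.
    The direction switch - no-switch is an assumption.
    """
    switch = [t for is_switch, t in trials if is_switch]
    repeat = [t for is_switch, t in trials if not is_switch]
    difference = mean(switch) - mean(repeat)
    return difference, difference / mean(switch)


# Toy values, not data from the study.
total = prosociality_total([3] * 16)
print(total, total / 16)
print(inverse_efficiency([1.0, 2.0, 3.0, 2.0], [1, 1, 0, 0]))
print(cognitive_flexibility([(True, 2.0), (True, 2.0), (False, 1.0), (False, 1.0)]))
```

With this definition, a participant who is slower on switch trials gets a positive normalized score below 1, which matches the range of CogFlexibility in the summary.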

The “data” files are named after the group number and type and are saved as .csv files. For instance, Group1HT.csv is the data file for heterogenous Group 1.

The “other” measures (prosociality, inverse efficiency, cognitive flexibility) are summarized in a separate data file starting with Other_Group…, which includes the group number and type. This file also contains information about age, gender, and results in the dictator game.

We merge datasets from all groups, thus obtaining two datasets:

  • the dataset with data from all groups:
 GroupNum   GroupType               TypeTest    Producer Guesser    
 1:832    Control:2912   ComGame        :4032   1:1456   1   :1008  
 2:832    Hetero :2912   FirstTesting   : 448   2:1456   2   :1008  
 3:832                   PassiveExposure: 896   3:1456   3   :1008  
 4:832                   SecondTesting  : 448   4:1456   4   :1008  
 5:832                                                   NA's:1792  
 6:832                                                              
 7:832                                                              
     Round            Shape           ACC             Word          
 Min.   : 0.000   aike   : 728   Min.   :0.0000   Length:5824       
 1st Qu.: 1.000   anap   : 728   1st Qu.:0.0000   Class :character  
 Median : 4.000   esip   : 728   Median :1.0000   Mode  :character  
 Mean   : 4.231   kesip  : 728   Mean   :0.6772                     
 3rd Qu.: 7.000   nekuki : 728   3rd Qu.:1.0000                     
 Max.   :10.000   nus    : 728   Max.   :1.0000                     
                  (Other):1456   NA's   :1344                       
     ID_CG           index          pair           GroupID       ProducSim     
 Min.   :0.000   Min.   :  1.00   1_2 : 672   1_Control: 416   Min.   :0.0000  
 1st Qu.:0.000   1st Qu.: 28.75   1_3 : 672   1_Hetero : 416   1st Qu.:0.5000  
 Median :2.000   Median : 80.50   1_4 : 672   2_Control: 416   Median :1.0000  
 Mean   :2.423   Mean   : 82.96   2_3 : 672   2_Hetero : 416   Mean   :0.7482  
 3rd Qu.:5.000   3rd Qu.:132.25   2_4 : 672   3_Control: 416   3rd Qu.:1.0000  
 Max.   :7.000   Max.   :176.00   3_4 : 672   3_Hetero : 416   Max.   :1.0000  
                                  NA's:1792   (Other)  :3328                   
  • the dataset with other information from all groups:
     PartID        GroupNum     prosoc       DictatorGame        Age       
 Min.   :1.00   Min.   :1   Min.   :1.812   Min.   :1.000   Min.   :18.00  
 1st Qu.:1.75   1st Qu.:2   1st Qu.:3.500   1st Qu.:3.000   1st Qu.:20.00  
 Median :2.50   Median :4   Median :3.812   Median :3.000   Median :23.00  
 Mean   :2.50   Mean   :4   Mean   :3.714   Mean   :2.786   Mean   :24.02  
 3rd Qu.:3.25   3rd Qu.:6   3rd Qu.:4.062   3rd Qu.:3.000   3rd Qu.:27.00  
 Max.   :4.00   Max.   :7   Max.   :4.438   Max.   :5.000   Max.   :37.00  
    Gender            WorkingMem      difference      CogFlexibility   
 Length:56          Min.   :1.061   Min.   :-1.0784   Min.   :-0.6577  
 Class :character   1st Qu.:1.301   1st Qu.: 0.3999   1st Qu.: 0.2424  
 Mode  :character   Median :1.551   Median : 0.5361   Median : 0.3271  
                    Mean   :1.695   Mean   : 0.6125   Mean   : 0.3164  
                    3rd Qu.:1.863   3rd Qu.: 0.8278   3rd Qu.: 0.4237  
                    Max.   :3.324   Max.   : 2.0339   Max.   : 0.6110  
  GroupType        
 Length:56         
 Class :character  
 Mode  :character  
                   
                   
                   

2.3 Raw outputs

We look at the productions during the FirstTesting and the last testing (called here SecondTesting). As a reminder, FirstTesting occurs after the passive exposure, and SecondTesting occurs after the communication game.

2.3.1 Heterogenous groups

Group 1:

Before training:

PartID aike anap esip kesip nekuki nus nusa puak
1 uine unip esip nesi nesipi nus nuse inus
2 suka nuki pasip nusip pikak nup nuki nukik
3 aike akin sepik kipe nekuki nus nusa peki
4 paik piak esup nenusa nekuki nus nusa piak

After training:

PartID aike anap esip kesip nekuki nus nusa puak
1 pine enup esip nenuse nepipi nus nuse puni
2 paik esup esip nanusip nukeki nuk suka ekip
3 paik kusip esip nenusa nenuki nus nusa nuki
4 paik sanip esip nenusa nenuka nus nusa senip

Group 2:

Before training:

PartID aike anap esip kesip nekuki nus nusa puak
1 eipe unup esip nesip nepupi esu nusu puei
2 aike upak esip kesip nusa nusa nusa puak
3 anap anap esip kesip nekuki nus nusa puak
4 aike kusap esi kesi sup nus nuk nuki

After training:

PartID aike anap esip kesip nekuki nus nusa puak
1 eipe unup esip nesip nepupi nus nusu pue
2 aike anap pua kesip kukupa nus nusa pua
3 aike anap esip kesip nekuki nus nusa pua
4 aike anap esip kesip kukupa nus nusa pua

Group 3:

Before training:

PartID aike anap esip kesip nekuki nus nusa puak
1 ie ee nus en neiu nus nus pu
2 aike anap kesip enis nukaki nus esip nesip
3 enik anak naku kesin enik nus nusa nekuki
4 aike anap esip kesip nekuki nuak nusa nuak

After training:

PartID aike anap esip kesip nekuki nus nusa puak
1 ie nus esip esip nuui nus nus pu
2 aike anap esip uiuie nukaki nus nusp iesie
3 suki nupa esip kesip nekuki nus nusa nekiki
4 aike anap esip kesip nekuki nus nusa nuak

Group 4:

Before training:

PartID aike anap esip kesip nekuki nus nusa puak
1 ie np nesis esis neui nus ns pien
2 puak asap esip kesip puak nup nuki puak
3 aike enus sup esun kuapa esip enus kaup
4 aike anap nusa aise nepik nus anap puak

After training:

PartID aike anap esip kesip nekuki nus nusa puak
1 inp enep uinus inus nepupi nus ienies pieu
2 aike asap pieu kessip nepupi nup inus puak
3 aike esep unus enus nekip nus inus kaup
4 aike anap kisa kessip nekip nus nusa puak

Group 5:

Before training:

PartID aike anap esip kesip nekuki nus nusa puak
1 ie np esip eip nus nus pu
2 una esip una sunua nukaki esip kika esep
3 aike nesip esip nesip nuas kapu nuap nakuku
4 aike euki esip apak nekuki nup asep isap

After training:

PartID aike anap esip kesip nekuki nus nusa puak
1 ie np esip esipp nusinu nupi nus inis
2 aike np epik sesss kasaki usa nup usa
3 aike supi esip sess nakuki nup nus np
4 aike np esip esipppp nukaki nuk ss epik

Group 6:

Before training:

PartID aike anap esip kesip nekuki nus nusa puak
1 esei enep nuis senui nus nusi nesupi
2 aike anap esip kesip nekuki nun nusa puak
3 aike anap esip sekin nekuki sap nusa puak
4 peki ani sunak nekaki nusaki nuk nekaki nusak

After training:

PartID aike anap esip kesip nekuki nus nusa puak
1 eipi enep epui puise pepupi nuse nus epei
2 aike anap seki kesip nekuki nan nusa puak
3 aike anap seki esip nekuki nak nusa puak
4 kuap ani nekaki epi nesaki suk peike kuap

Group 7:

Before training:

PartID aike anap esip kesip nekuki nus nusa puak
1 in n pui nu nu nus ui n
2 aike anap esip kapu nekuki asap nusa naku
3 aike anas esip nesip nekuki nus nusa puki
4 enke apa ekip esnik a ip ana pun

After training:

PartID aike anap esip kesip nekuki nus nusa puak
1 eipe nesu epin pine neipe nus nusu sinu
2 aike anas nekip pinu nekuki nus nusa senu
3 aike anas nukip pinu nuipe nus nusa seni
4 pike anas enpik enpiku nipuki nus nusa pine

2.3.2 Control groups

Group 1:

Before training:

PartID aike anap esip kesip nekuki nus nusa puak
1 aike akap esip kesip nekuki nue isip ekup
2 aike usap aise pasu nekaki pasu nasu nesu
3 aike puak puak kesip nekuki nus nusa anani
4 aike anap esip kesip nakuki nus nusa pusa

After training:

PartID aike anap esip kesip nekuki nus nusa puak
1 aike anap esip kesip nekuki nuk nusa puak
2 aike anu espi kesip nekuki nuk nusa puak
3 aike anap esip kesip nekuki nus nusa puak
4 aike anap esip kesip nekuki nus nusa puak

Group 2:

Before training:

PartID aike anap esip kesip nekuki nus nusa puak
1 aike puak nepik nekip pekuke nus puek ekip
2 aupi anaap neuik kesip nekuki nus nus auki
3 aike apap esip kepip nekuke nunu anuna esip
4 aike enuik espin nesu pinak nus nusa puak

After training:

PartID aike anap esip kesip nekuki nus nusa puak
1 aike asap peuk kesip nekuki nus nuna kapi
2 ausi asap peuk kesip nekuki nus nuna ausi
3 aike asap puki kesip nekuki nus nuna puik
4 aike asap sik puik akunu nus nuna suipe

Group 3:

Before training:

PartID aike anap esip kesip nekuki nus nusa puak
1 aike anap esip kesip nukuki nus nusa pinuk
2 aine anep esip kesip anaki nus kanip anap
3 aike anapi nuas neki nekuki nus nuas puas
4 aspe anak enis nusa nekuki epis suki pua

After training:

PartID aike anap esip kesip nekuki nus nusa puak
1 aike nap esip kesip nukuki nus nusa pusip
2 aike anasp esip kesip nekuki nus nusa puaki
3 aike anasp esip kesip nekuki nus nusa puaki
4 aike anasp esip kesip nekuki nus nusa puaki

Group 4:

Before training:

PartID aike anap esip kesip nekuki nus nusa puak
1 aike anap puka kesip nekuki nus nusa esip
2 aike anak esip kesip nukiki nus nusa puak
3 asip anap esip keniki keniki nupu pasu esik
4 puaki anap nuk kuni pukaki nus nusa puak

After training:

PartID aike anap esip kesip nekuki nus nusa puak
1 aike anap esip kesip nekiki nus nusa puak
2 aike anap esip kesip nanuki nus nusa puak
3 aike anap esip kesip kanuki nus unsa puka
4 aike anap esip kesip nekiki nus nusa puak

Group 5:

Before training:

PartID aike anap esip kesip nekuki nus nusa puak
1 aike anap esip kesip nekuki nus nusa kuak
2 kesi anap pusa kesip kununa nap nepi kupi
3 aike asap esip kesip nekusi nus enis pnua
4 aike anak esip esip nekuki nus nusa puak

After training:

PartID aike anap esip kesip nekuki nus nusa puak
1 aike anap esip kesip nekuki nus nusa puak
2 nues napa esip kesip nekuki nus nusa puak
3 aike anap esip kesip nekuki nus nusa puak
4 aike anap esip kesip nekuki nus nusa puak

Group 6:

Before training:

PartID aike anap esip kesip nekuki nus nusa puak
1 aike enap esip kesip nekuki nus nusa puak
2 esai punke esip enku nekuki nusa nusa akik
3 uas anap puki sekip nekuki nas auke suki
4 aike kesip aike kesip kusseni esip nusa puak

After training:

PartID aike anap esip kesip nekuki nus nusa puak
1 aike enap esip kesip nekuki nus nusa ipuk
2 aike enap esip kesip nekuki nus nasu paku
3 aike enap esip kesip nekuki nus nusa paku
4 aike enap esip kesip nekuki nus nusa puka

Group 7:

Before training:

PartID aike anap esip kesip nekuki nus nusa puak
1 aike asap esip keisa nekuki nus nusa aike
2 aike anap esip kesip nekuki nus nusa puak
3 aike nus nusip kuap nekuki nusa nesip kuap
4 aike anap esip kesip nekuki nus nusa kuak

After training:

PartID aike anap esip kesip nekuki nus nusa puak
1 aike anap esip kesip nekuki nus nusa puas
2 aike asap esip kesip nekuki nus nusa puak
3 aike anap esip kesip nekuki nus nusa kuap
4 aike anap esip kesip nekuki nus nusa puak

3 Results

3.1 0 - Understanding the results

3.1.1 Learning Accuracy

Here, we examine how well the participants remembered the initial labels for the images, looking at the labels written by the participants during the FirstTesting.

Accuracy is binary: a value of 1 indicates correct recall by the participant, while a value of 0 signifies at least one error. We made an exception for the biased participant, whose responses were considered “correct” if they substituted the biased letter they were unable to produce with another letter, or if they omitted it. Please note that for the biased participant in the heterogenous groups, we coded the accuracy manually, and only for the pre-communication testing (round 0), since this is the only context in which we use this measure.
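A strict, automated reading of this rule can be sketched as below. This is an illustration only: the accuracy of the biased participant was coded manually in the study, and the manual coding may have been more lenient than this rule, so the sketch is not expected to reproduce it. The function name and the regular-expression approach are assumptions.

```python
import re

BIASED_LETTERS = "ka"        # letters the biased participant cannot type
UNBIASED_LETTERS = "psneiu"  # letters available to everyone


def is_correct(label, production, biased_participant=False):
    """Return 1 if the production counts as correct recall, else 0.

    Unbiased participants must reproduce the label exactly. For the
    biased participant, each biased letter may be omitted or replaced
    by any single unbiased letter (strict reading of the rule).
    """
    if not biased_participant:
        return int(production == label)
    pattern = "".join(
        f"[{UNBIASED_LETTERS}]?" if ch in BIASED_LETTERS else re.escape(ch)
        for ch in label
    )
    return int(re.fullmatch(pattern, production) is not None)


print(is_correct("aike", "ie", biased_participant=True))     # omission
print(is_correct("aike", "uine", biased_participant=True))   # substitution
print(is_correct("kesip", "nesi", biased_participant=True))  # final p missing
```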

3.1.2 Initial production similarity

Participants were presented with the following initial labels during the passive exposure phase: kesip, esip, nusa, nus, aike, puak, nekuki, and anap. Our analysis focuses on the differences between participants’ first productions (during FirstTesting) and the initial labels. To do this, we computed the normalized Levenshtein distance between the initial labels and the initial productions. Please note that we tried to find a way to make an exception for the biased participant, as we did for the learning accuracy. However, we could not find a solution that would neither “advantage” nor “disadvantage” the biased participant. Thus, when plotting the results for the initial production similarity, we removed the biased participant, in order to have two comparable sets of data.
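As an illustration, a normalized Levenshtein measure can be computed as in the sketch below. Two choices here are assumptions, since the text does not specify them: the distance is normalized by the length of the longer word, and the similarity is taken as one minus the normalized distance.

```python
def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(a, b):
    """Distance divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def production_similarity(label, production):
    """1 - normalized distance (assumed definition of similarity)."""
    return 1.0 - normalized_distance(label, production)


# Example pairs: an initial label and a FirstTesting production.
for label, production in [("nusa", "nusa"), ("kesip", "esip"), ("aike", "paik")]:
    print(label, production, production_similarity(label, production))
```

Under these assumptions an exact recall scores 1, and a production that shares nothing with the label scores 0.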

**Figure 1.** Learning success (accuracy on the left, initial production similarity on the right) per group type. Dots indicate each participant. The red dot shows the mean of the group, and red point range shows the standard error.


On average, participants remembered 47.1% of the initial labels (SD = 27.18). The average production similarity with the initial labels is 0.67 (SD = 0.2).

We check whether learning differs significantly between the two group types. This is not a hypothesis, just a sanity check that our two group types are similar.

With accuracy learning:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: mean_acc ~ GroupType + (1 | GroupID)
   Data: df_agg_acc

REML criterion at convergence: 515.9

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.57025 -0.73141 -0.06612  0.85951  1.78513 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept)   0.0     0.00   
 Residual             729.5    27.01   
Number of obs: 56, groups:  GroupID, 14

Fixed effects:
                Estimate Std. Error     df t value Pr(>|t|)    
(Intercept)       51.786      5.104 54.000  10.146 4.09e-14 ***
GroupTypeHetero   -9.375      7.218 54.000  -1.299      0.2    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr)
GroupTypHtr -0.707
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')

With initial production similarity (please note that this model includes all productions from the control groups, but only the productions of participants 2, 3, and 4 in the heterogenous groups, since the biased participant cannot be directly compared to them):

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: mean_lev_dis ~ GroupType + (1 | GroupID)
   Data: df_agg_dist2

REML criterion at convergence: -12.6

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.3659 -0.7941 -0.1304  0.9421  1.7428 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 0.00000  0.0000  
 Residual             0.03911  0.1978  
Number of obs: 49, groups:  GroupID, 14

Fixed effects:
                Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)      0.70704    0.03737 47.00000  18.919   <2e-16 ***
GroupTypeHetero -0.09335    0.05709 47.00000  -1.635    0.109    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr)
GroupTypHtr -0.655
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')

-> The learning difference between the two group types is not significant (p = 0.2 for accuracy, p = 0.109 for initial production similarity).

Looking at each shape:

**Figure 2.** Levenshtein distance between initial words and initial productions for each word. Blue dots show biased participants.


3.1.3 Participants’ characteristics

**Figure 3.** Distribution of the scores. Red shows the results for the control group, blue shows the results for the heterogenous group.


When running t-tests comparing each of these characteristics between the two group types, we obtain the following p-values (please note that we used a chi-squared test for gender):

  • For prosociality: 0.66
  • For working memory: 0.55
  • For cognitive flexibility: 0.81
  • For dictator game: 1
  • For age: 0.41
  • For gender: 0.77
  • For accuracy learning: 0.2
  • For distance to initial words: 0.02

None of these values is significant, except for the distance to initial words. This is readily explained: the initial production similarity is not directly comparable between biased and unbiased participants (see Initial production similarity). If we remove the biased participants, this p-value is no longer significant: 0.3.

3.1.4 Strategy

In this part, we only look at the patterns observed in the heterogenous groups.

What strategies did the biased participant use to produce the initial labels before communicating? In this table, we count the number of times each biased participant used each of the following strategies:

  • removal means that the biased participant remembered the label and removed the biased letter

  • switch means that the biased participant remembered the label and replaced the biased letter with an unbiased letter

  • forgot means that the biased participant had forgotten the initial label

removal forgot switch
Group1 0 1 5
Group2 0 0 6
Group3 4 2 0
Group4 5 1 0
Group5 5 1 0
Group6 0 4 2
Group7 0 6 0

Please note that these tables were filled manually.

We then look at the pattern shown by the biased participant after communicating. How did this participant produce the label? The possible options are:

  • removal: used the initial label, but removed the biased letter

  • switch: used the initial label, but replaced the biased letter with an unbiased letter (please note that for the label “nekuki”, we count productions such as “nepupi” or “nepipi” as “switch”; they are not exactly the same word and could be placed in the “new” column, but we think that the strategy is actually a switch combined with slight forgetting)

  • new: adopted a new label

  • special: refers to the case where the participant removed the biased letter but also added a letter at the end of the word. The word was then identifiable by the other members of the group through this added final letter (for example, esipp, while other participants produced esippp, sesss, sessss).

        removal  new  switch  special
Group1        0    1       5        0
Group2        0    6       0        0
Group3        5    0       1        0
Group4        0    2       4        0
Group5        3    1       1        1
Group6        1    3       2        0
Group7        0    4       2        0

These tables concern only the biased participants. We can also look at the productions of the unbiased participants after they communicated (during the final testing) to examine their adaptive strategy. Please note that we only look here at the labels featuring one or two biased letters (all labels but nus and esip). The possible options are:

  • SameBiased_Removed: used the same label as the biased participant, in the case where the biased participant has removed a biased letter

  • SameBiased_Switch: used the same label as the biased participant, in the case where the biased participant has switched a biased letter with another unbiased letter

  • SameBiased_New: used the same label as the biased participant, in the case where the biased participant has created a new label

  • InitialLabel: used the initial label, even if the biased participant cannot use it the same way

  • DiffBiased_Adapt: used a different label from the biased participant, and this new label does not feature any biased letter

  • DiffBiased_ABitAdapt: used a different label from the biased participant, and this new label features fewer biased letters than the original label (namely, one biased letter instead of two)

  • DiffBiased_NonAdapt: used a label different from both the biased participant's label and the initial label, and featuring biased letters

        SameBiased_Removed  SameBiased_Switch  SameBiased_New  InitialLabel  DiffBiased_Adapt  DiffBiased_ABitAdapt  DiffBiased_NonAdapt
Group1                   0                  0               0             2                 2                     5                    9
Group2                   0                  0               0            14                 0                     3                    2
Group3                   0                  0               0            10                 2                     3                    3
Group4                   0                  1               0             7                 4                     2                    4
Group5                   3                  0               0             3                 7                     2                    3
Group6                   0                  0               0            11                 0                     2                    5
Group7                   0                  0               0             6                 6                     2                    4

When adapting to the biased participant, the unbiased participants use three main types of strategy (counts are out of a total of 126 occurrences: 6 words containing a biased letter × 21 unbiased participants):

  • they use the same word as the biased participant: 4

  • they use a new label, not used by the biased participant, that does not feature any biased letter: 21

  • they use a new label, not used by the biased participant, that features fewer biased letters than the original label: 19

When they do not adapt, they use either the initial label or another label that contains as many biased letters as the original label (or more): 83
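The counts quoted above can be recovered by summing the columns of the table of unbiased participants' productions. The short Python sketch below does this; the variable and function names are ours (illustrative only), and the numbers are copied from the table as printed.

```python
# Counts copied from the table of unbiased participants' final productions.
COLUMNS = [
    "SameBiased_Removed", "SameBiased_Switch", "SameBiased_New",
    "InitialLabel", "DiffBiased_Adapt", "DiffBiased_ABitAdapt",
    "DiffBiased_NonAdapt",
]
COUNTS = {
    "Group1": [0, 0, 0, 2, 2, 5, 9],
    "Group2": [0, 0, 0, 14, 0, 3, 2],
    "Group3": [0, 0, 0, 10, 2, 3, 3],
    "Group4": [0, 1, 0, 7, 4, 2, 4],
    "Group5": [3, 0, 0, 3, 7, 2, 3],
    "Group6": [0, 0, 0, 11, 0, 2, 5],
    "Group7": [0, 0, 0, 6, 6, 2, 4],
}


def column_totals(counts):
    """Sum each strategy column over all groups."""
    return {
        name: sum(row[i] for row in counts.values())
        for i, name in enumerate(COLUMNS)
    }


totals = column_totals(COUNTS)

# Group the columns into the categories discussed in the text.
same_as_biased = (
    totals["SameBiased_Removed"]
    + totals["SameBiased_Switch"]
    + totals["SameBiased_New"]
)
adapt_no_biased_letter = totals["DiffBiased_Adapt"]
adapt_fewer_biased_letters = totals["DiffBiased_ABitAdapt"]
no_adaptation = totals["InitialLabel"] + totals["DiffBiased_NonAdapt"]

print(same_as_biased, adapt_no_biased_letter,
      adapt_fewer_biased_letters, no_adaptation)
```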

3.1.5 Summary results

  • learning performance (measured with both the accuracy index and the distance to initial words) is similar between control and heterogeneous groups

  • some words are easier to remember than others (such as aike or nus)

  • the participants' characteristics are similar between control and heterogeneous groups (on average, they have approximately the same age, and similar personal characteristics such as working memory, cognitive flexibility…)

  • in order to compensate for their difference, biased participants either remove the biased letter(s) or switch them for another letter. After communicating, they also come up with new words

  • unbiased participants adapt mostly by using another label that features no biased letter (or fewer biased letters), and sometimes they adapt by copying the label used by the biased participant

3.2 1 - How did language evolve in the heterogeneous groups?

This part presents the plots and the models that are referred to in the main paper.

3.2.1 Communicative success

Here, we examine the level of communicative success in interactions, distinguishing between successful (success = 1) and unsuccessful (success = 0) interactions within pairs. Please note that the variable Round2 is Round in reverse order (so that Round 9 becomes Round 0).

We look at the evolution of communicative success for each group:

**Figure 4.** Evolution of communicative success for each group.


We can see that the communicative success of Group 2 among the heterogeneous groups is particularly high.

Let’s look at the aggregated performance for each group type:

**Figure 5.** Evolution of communicative success aggregated by group type: control or heterogenous.


These plots raise two questions:

  • Is there a significant improvement of communicative success with time?

  • Is there a significant difference of communicative success between heterogeneous and control groups?

To investigate these questions, we build a linear mixed-effects model using the group type (hetero versus control) and the round as fixed effects. We use the data aggregated by pair, and we include the group number as a random effect.

Model 1:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: mean_acc ~ GroupType * Round2 + (1 | GroupID)
   Data: df_agg

REML criterion at convergence: 2082.6

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.3675 -0.5828 -0.0625  0.7277  3.4084 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 149.1    12.21   
 Residual             208.4    14.43   
Number of obs: 252, groups:  GroupID, 14

Fixed effects:
                       Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)            102.3512     5.1882  16.4845  19.728 6.75e-13 ***
GroupTypeHetero        -27.8571     7.3373  16.4845  -3.797  0.00151 ** 
Round2                  -5.1265     0.4980 236.0000 -10.293  < 2e-16 ***
GroupTypeHetero:Round2   1.0491     0.7043 236.0000   1.489  0.13770    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) GrpTyH Round2
GroupTypHtr -0.707              
Round2      -0.384  0.272       
GrpTypHt:R2  0.272 -0.384 -0.707

We also fitted a model in which the data are not aggregated by pair (a binomial mixed model on the success of each interaction). In this new model, we also added a random effect for pair. The results of the model are very similar, but show even stronger effects.

Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: ACC ~ GroupType * Round2 + (1 | GroupID) + (1 | pair)
   Data: df2

     AIC      BIC   logLik deviance df.resid 
  4061.5   4099.3  -2024.8   4049.5     4026 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-9.1003 -0.7776  0.2886  0.6434  1.8459 

Random effects:
 Groups  Name        Variance Std.Dev.
 GroupID (Intercept) 0.47217  0.6871  
 pair    (Intercept) 0.01215  0.1102  
Number of obs: 4032, groups:  GroupID, 14; pair, 6

Fixed effects:
                       Estimate Std. Error z value Pr(>|z|)    
(Intercept)             3.70442    0.31978  11.584  < 2e-16 ***
GroupTypeHetero        -2.54209    0.41995  -6.053 1.42e-09 ***
Round2                 -0.43406    0.03037 -14.295  < 2e-16 ***
GroupTypeHetero:Round2  0.24810    0.03568   6.953 3.57e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) GrpTyH Round2
GroupTypHtr -0.744              
Round2      -0.526  0.396       
GrpTypHt:R2  0.442 -0.439 -0.837

Performance improves significantly with time in both group types (control and heterogeneous), as shown by the main effect of the Round2 variable (the estimate is negative because Round2 counts the rounds in reverse order). Performance is lower in the heterogeneous groups than in the control groups, as shown by the main effect of the GroupType variable, despite the very high performance reached by heterogeneous Group 2. Moreover, the interaction between group type and round is not significant in the aggregated model (Model 1), whereas it is significant in the non-aggregated model.
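To read the estimates of the binomial model on the probability scale, the log-odds can be passed through the inverse logit. The Python sketch below does this for the fixed effects only (it ignores the random effects of group and pair, so it describes a "typical" group rather than any observed one). The coefficient values are copied from the model output above; the function name is ours.

```python
import math

# Fixed-effect estimates copied from the binomial mixed model output.
INTERCEPT = 3.70442
GROUP_HETERO = -2.54209
ROUND2 = -0.43406
HETERO_BY_ROUND2 = 0.24810


def predicted_success(hetero: bool, round2: int) -> float:
    """Fixed-effects-only predicted probability of a successful interaction.

    round2 is the reversed round index (0 = last round, as stated in the text).
    """
    h = 1.0 if hetero else 0.0
    log_odds = (
        INTERCEPT
        + GROUP_HETERO * h
        + ROUND2 * round2
        + HETERO_BY_ROUND2 * h * round2
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


# Last round (Round2 = 0) and an earlier round (Round2 = 8) for each group type.
for hetero in (False, True):
    label = "Hetero" if hetero else "Control"
    print(label, round(predicted_success(hetero, 0), 3),
          round(predicted_success(hetero, 8), 3))
```

Under these assumptions, the predicted success in the last round is about 0.98 for control groups and about 0.76 for heterogeneous groups, and it is lower at larger values of Round2 (earlier rounds) in both cases.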

From this analysis, it is clear that the average communicative success is lower in heterogeneous groups than in control groups. However, this could be because heterogeneous groups include interactions with a biased participant. It is therefore interesting to split the heterogeneous groups into two categories:

  • Hetero_biased (pairs in the heterogeneous groups involving the biased participant: 1-2, 1-3, 1-4)

  • Hetero_nonbiased (pairs in the heterogeneous groups that do not involve the biased participant: 2-3, 2-4, 3-4).

This could lead to two different scenarios:

  1. Hetero_nonbiased has a high communicative success, similar to that of control groups, because these pairs do not include the biased participant;

  2. The confusion introduced by the biased participant spreads to all participants, and the performance of Hetero_nonbiased is still lower than that of control groups.

**Figure 6.** Same as above, except that here we split heterogenous groups in 2: pairs including the biased participant and pairs without the biased participant.


The plot suggests that scenario 2 is supported: the presence of the biased participant has introduced confusion within the heterogeneous groups, leading to a decrease in communicative success even in interactions between unbiased participants. However, within the heterogeneous groups, the pairs including the biased individual seem to achieve an even lower accuracy score. We used a mixed-effects model to see whether this is statistically significant (this is model 1 bis of the main paper):

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: mean_acc ~ type * Round2 + (1 | GroupID)
   Data: df_agg

REML criterion at convergence: 2060.2

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.4762 -0.6021 -0.0308  0.7018  3.1542 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 149.8    12.24   
 Residual             195.5    13.98   
Number of obs: 252, groups:  GroupID, 14

Fixed effects:
                                 Estimate Std. Error       df t value Pr(>|t|)
(Intercept)                      102.3512     5.1646  16.1891  19.818 8.84e-13
typeHetero With Biased           -32.8274     7.6565  19.4992  -4.288 0.000378
typeHetero Without Biased        -22.8869     7.6565  19.4992  -2.989 0.007391
Round2                            -5.1265     0.4825 234.0000 -10.626  < 2e-16
typeHetero With Biased:Round2      0.9896     0.8356 234.0000   1.184 0.237530
typeHetero Without Biased:Round2   1.1086     0.8356 234.0000   1.327 0.185909
                                    
(Intercept)                      ***
typeHetero With Biased           ***
typeHetero Without Biased        ** 
Round2                           ***
typeHetero With Biased:Round2       
typeHetero Without Biased:Round2    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
                   (Intr) typHtrWthBs typHtrWthtB Round2 typeHtrWthBsd:Rnd2
typHtrWthBs        -0.675                                                  
typHtrWthtB        -0.675  0.820                                           
Round2             -0.374  0.252       0.252                               
typeHtrWthBsd:Rnd2  0.216 -0.437      -0.146      -0.577                   
typHtrWthtBsd:Rnd2  0.216 -0.146      -0.437      -0.577  0.333            

Let’s now directly compare the pairs with the biased participant to the pairs without. Note that, in the control groups as well, the pairs involving participant 1 are treated as the pairs with the biased participant. This allows us to check that there is no difference between these “sham” biased pairs and the other pairs in the control groups.

**Figure 7.** Mean communicative success for each group, depending on whether unbiased participants communicate with the biased participant (participant 1) or with an unbiased participant (participants 2, 3, 4). Bars show standard error.

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: mean_acc ~ GroupType * have_biased + (1 | GroupID)
   Data: df_agg

REML criterion at convergence: 2194.1

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.6009 -0.6198  0.1994  0.7254  2.5621 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 141.6    11.90   
 Residual             343.6    18.54   
Number of obs: 252, groups:  GroupID, 14

Fixed effects:
                               Estimate Std. Error      df t value Pr(>|t|)    
(Intercept)                      79.464      5.067  15.010  15.682 1.02e-10 ***
GroupTypeHetero                 -16.071      7.166  15.010  -2.243  0.04044 *  
have_biasedyes                    4.762      3.303 236.000   1.442  0.15066    
GroupTypeHetero:have_biasedyes  -15.179      4.671 236.000  -3.250  0.00132 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) GrpTyH hv_bsd
GroupTypHtr -0.707              
have_bisdys -0.326  0.230       
GrpTypHtr:_  0.230 -0.326 -0.707

We also ran an extra analysis on communicative success, to observe how it evolves within a round. Indeed, as participants encounter each image only once per round, one could expect communicative success to increase within a round through a process of elimination.

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: ACC ~ InteractionNum * GroupType + (1 | Producer) + (1 | GroupID) +  
    (1 | Round)
   Data: df_agg

REML criterion at convergence: 41453.1

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.5698 -0.9597  0.2247  0.7487  1.8828 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept)  154.809 12.442  
 Round    (Intercept)  160.782 12.680  
 Producer (Intercept)    2.637  1.624  
 Residual             1679.405 40.981  
Number of obs: 4032, groups:  GroupID, 14; Round, 9; Producer, 4

Fixed effects:
                                Estimate Std. Error        df t value Pr(>|t|)
(Intercept)                      77.3409     6.6563   22.8132  11.619 4.65e-11
InteractionNum                    0.5299     0.1980 4006.6871   2.676  0.00748
GroupTypeHetero                 -21.9841     7.1807   15.1444  -3.062  0.00784
InteractionNum:GroupTypeHetero   -0.1972     0.2800 4005.0007  -0.704  0.48120
                                  
(Intercept)                    ***
InteractionNum                 ** 
GroupTypeHetero                ** 
InteractionNum:GroupTypeHetero    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) IntrcN GrpTyH
InteractnNm -0.253              
GroupTypHtr -0.539  0.234       
IntrctN:GTH  0.179 -0.707 -0.331

Summary results:

  • Communicative success improves with time (for both groups)

  • Communicative success is higher in control groups

  • In heterogeneous groups, communicative success is higher in pairs that do not contain the biased individual. However, the communicative success of these pairs is still lower than that of control groups, which suggests that introducing a biased participant has spread some confusion in the whole group.

3.2.2 Convergence

In this part, we look at the convergence between all words produced in a round. As a reminder, in each round, each participant produces one word for each shape. So in each round, there are 4 word productions for each shape: convergence will be high if these words are similar (such as kesip, kesup, kesip and kesip) but low if these words are very different from each other (for example, kesip, onup, asip and keku).

Convergence is computed the following way:

  1. Calculate the normalized Levenshtein distance between all pairs of words within the set of four words.

  2. Find the average of these distances to obtain a single numerical value for each Round, each Shape, and each Group.

  3. Take the complement of this value, so that the measure of convergence increases when the words are more similar, rather than the opposite.

In other words, \(convergence = 1 - mean(dis(SetWords))\), where dis(SetWords) is the set of pairwise normalized Levenshtein distances between all words in SetWords.
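As an illustration, the three steps above can be sketched in Python. This is not the code used for the analyses: the function names are ours, and we assume here that each Levenshtein distance is normalized by the length of the longer of the two words.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Assumption: normalize by the length of the longer word."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def convergence(words: list[str]) -> float:
    """1 minus the mean pairwise normalized distance within a set of words."""
    distances = [normalized_distance(a, b) for a, b in combinations(words, 2)]
    return 1 - sum(distances) / len(distances)


high = convergence(["kesip", "kesup", "kesip", "kesip"])
low = convergence(["kesip", "onup", "asip", "keku"])
print(round(high, 2), round(low, 2))
```

With this normalization, the first example set from the text (kesip, kesup, kesip, kesip) has three pairs at distance 0.2 and three identical pairs, giving a convergence of 0.9; the second set gives a much lower value.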

The plot below includes the production of all participants, including the biased one:

**Figure 8.** Evolution of convergence with time for each group.


We look at the same plot aggregated by group type:

First, we look at the convergence with all participants:

**Figure 9.** Same, but aggregated by group.


And we look at the model comparing control and heterogenous groups. This refers to the model 2 of the paper:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: All ~ GroupType * Round2 + (1 | GroupID)
   Data: df_agg

REML criterion at convergence: -316.7

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.7091 -0.6778  0.0682  0.6382  2.2009 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 0.010588 0.10290 
 Residual             0.004724 0.06873 
Number of obs: 154, groups:  GroupID, 14

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)              0.956567   0.041561  14.440812  23.016 8.70e-13 ***
GroupTypeHetero         -0.328186   0.058777  14.440812  -5.584 6.01e-05 ***
Round2                  -0.041340   0.002477 138.000000 -16.690  < 2e-16 ***
GroupTypeHetero:Round2   0.017065   0.003503 138.000000   4.872 2.99e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) GrpTyH Round2
GroupTypHtr -0.707              
Round2      -0.298  0.211       
GrpTypHt:R2  0.211 -0.298 -0.707

Then, we look at:

  • “Hetero With Biased”: convergence in heterogeneous groups with all participants (what was computed before; 4 productions)
  • “Hetero Without Biased”: convergence in heterogeneous groups excluding the biased participant (3 productions)
  • “Control”: convergence in control groups (4 productions)

**Figure 10.** Same as above, but distinguishing between hetero with biased and hetero without biased participants.

And we observe a model based on this plot, comparing control, hetero with biased, hetero without biased. This refers to the model 2 bis of the main paper:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: mean_lev_dist ~ TypeGroup * Round2 + (1 | GroupID)
   Data: df_agg

REML criterion at convergence: -459.7

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.52680 -0.64233  0.08447  0.64197  2.18911 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 0.011642 0.10790 
 Residual             0.005428 0.07367 
Number of obs: 231, groups:  GroupID, 14

Fixed effects:
                                        Estimate Std. Error         df t value
(Intercept)                             0.956567   0.043703  14.918711  21.888
TypeGroupHetero With Biased            -0.328186   0.061805  14.918711  -5.310
TypeGroupHetero Without Biased         -0.254336   0.061805  14.918711  -4.115
Round2                                 -0.041340   0.002655 213.050677 -15.571
TypeGroupHetero With Biased:Round2      0.017065   0.003755 213.050677   4.545
TypeGroupHetero Without Biased:Round2   0.012387   0.003755 213.050677   3.299
                                      Pr(>|t|)    
(Intercept)                           9.45e-13 ***
TypeGroupHetero With Biased           8.90e-05 ***
TypeGroupHetero Without Biased        0.000927 ***
Round2                                 < 2e-16 ***
TypeGroupHetero With Biased:Round2    9.20e-06 ***
TypeGroupHetero Without Biased:Round2 0.001137 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
                      (Intr) TypGrpHtrWthBs TypGrpHtrWthtB Round2
TypGrpHtrWthBs        -0.707                                     
TypGrpHtrWthtB        -0.707  0.935                              
Round2                -0.304  0.215          0.215               
TypeGrpHtrWthBsd:Rnd2  0.215 -0.304         -0.152         -0.707
TypGrpHtrWthtBsd:Rnd2  0.215 -0.152         -0.304         -0.707
                      TypeGrpHtrWthBsd:Rnd2
TypGrpHtrWthBs                             
TypGrpHtrWthtB                             
Round2                                     
TypeGrpHtrWthBsd:Rnd2                      
TypGrpHtrWthtBsd:Rnd2  0.500               

3.2.3 Evolution of production similarity

This analysis helps us gain insight into our data. We want to know if participants eventually adopt the initial labels, even if they initially didn’t remember them. To find out, we calculate the average Levenshtein distance between participants’ productions and the initial labels at each round.

Please note that this measure penalizes the biased participants: even if they remember the initial labels correctly, they may not be able to reproduce them exactly.

We look at the evolution of production similarity for each shape:

**Figure 11.** Evolution of the production similarity at each round, for each shape. Please note that this plot does not include the productions of the biased participant.


To see the production similarity by group:

**Figure 12.** Evolution of the production similarity at each round, for each shape. Please note that this plot does not include the productions of the biased participant.


We observe that certain shapes are more effectively remembered than others. For instance, “aike” and “nus” are often well-remembered, while “puak” tends to be frequently forgotten.

This plot also reveals that individuals in the control groups often converge on the initial labels in the end, even if they initially forgot them. However, in the heterogeneous groups, this convergence does not occur. Participants’ productions tend to become slightly closer to the initial labels, but in the end, the words still remain quite different, even for words that did not contain a biased letter (nus and esip)!

We look at the same plot aggregated by group type. Please note that the following plot is biased, because the biased participant could not produce the exact initial labels. Below, you will find a plot that is more suited to compare control and heterogenous groups.

**Figure 13.** Same plot as above, but aggregated by shape. Please note that this plot does not include the productions of the biased participant.


Then, we apply a model to compare the two groups, which is referred to as model 3 in the main paper:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: mean_lev_dis ~ GroupType * Round2 + (1 | GroupID)
   Data: df_agg

REML criterion at convergence: -610.4

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.8445 -0.6126  0.1272  0.6344  3.3697 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 0.01113  0.1055  
 Residual             0.01937  0.1392  
Number of obs: 616, groups:  GroupID, 14

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)              0.934176   0.042537  14.389325  21.961 1.80e-12 ***
GroupTypeHetero         -0.304382   0.060157  14.389325  -5.060  0.00016 ***
Round2                  -0.023053   0.002507 600.000001  -9.193  < 2e-16 ***
GroupTypeHetero:Round2   0.014273   0.003546 600.000001   4.025 6.43e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) GrpTyH Round2
GroupTypHtr -0.707              
Round2      -0.295  0.208       
GrpTypHt:R2  0.208 -0.295 -0.707

Now, we split the heterogeneous condition in two: participants in heterogeneous groups in pairs with (hetero_biased) and without (hetero_unbiased) the biased participant, as in the previous plots. It is a better measure since the data from the biased participant is artificially biased.

**Figure 14.** Same plot as above, except that here, we differentiate between pairs interacting with the biased individuals, and pairs interacting without the biased individual.


We can see that as expected, due to our measure of production similarity, the performance of the biased participant was dragging the whole group to lower similarity. However, we still can find differences between the control groups and the heterogenous groups containing only unbiased participants.

Let’s see if this difference is significant (model 3 bis in the main paper):

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: mean_lev_dis ~ Condition * Round2 + (1 | GroupID)
   Data: df_agg

REML criterion at convergence: -377

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-4.0087 -0.6323  0.0673  0.6161  3.4418 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 0.011787 0.10857 
 Residual             0.005042 0.07101 
Number of obs: 189, groups:  GroupID, 14

Fixed effects:
                                        Estimate Std. Error         df t value
(Intercept)                             0.953641   0.045431  16.869431  20.991
ConditionHetero With Biased            -0.424984   0.064249  16.869431  -6.615
ConditionHetero Without Biased         -0.273413   0.064249  16.869431  -4.255
Round2                                 -0.026245   0.003465 171.069028  -7.575
ConditionHetero With Biased:Round2      0.019480   0.004900 171.069028   3.975
ConditionHetero Without Biased:Round2   0.012143   0.004900 171.069028   2.478
                                      Pr(>|t|)    
(Intercept)                           1.58e-13 ***
ConditionHetero With Biased           4.56e-06 ***
ConditionHetero Without Biased        0.000542 ***
Round2                                2.16e-12 ***
ConditionHetero With Biased:Round2    0.000103 ***
ConditionHetero Without Biased:Round2 0.014178 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
                     (Intr) CndtnHtrWthBs CndtnHtrWthtB Round2
CndtnHtrWthBs        -0.707                                   
CndtnHtrWthtB        -0.707  0.908                            
Round2               -0.381  0.270         0.270              
CondtnHtrWthBsd:Rnd2  0.270 -0.381        -0.191        -0.707
CndtnHtrWthtBsd:Rnd2  0.270 -0.191        -0.381        -0.707
                     CondtnHtrWthBsd:Rnd2
CndtnHtrWthBs                            
CndtnHtrWthtB                            
Round2                                   
CondtnHtrWthBsd:Rnd2                     
CndtnHtrWthtBsd:Rnd2  0.500              

3.2.4 Production similarity with biased participant

Here, we look at a different index: how similar the productions of the unbiased participants are to the productions of the biased participant. We do this by computing the Levenshtein distance between the productions of the unbiased participants and the production of the biased participant in the same round.

As a check, we do this for both group types; please note that in control groups, we treat “Participant 1” (who is unbiased) as the biased participant.

**Figure 15.** Evolution of the production similarity of unbiased participants with the biased participants.


**Figure 16.** Same as above, but plotting the linear regression going through the points.


We can see that the participants used terms increasingly similar to the labels used by the biased participant. However, the comparison with control groups (in which the biased participant is just Participant 1, i.e. an unbiased participant) suggests that this is not specifically an adaptation to the biased participant, but is more probably related to the fact that participants converge, at least partly, on the initial labels.

3.2.5 Stability

Stability is a measure based on the Levenshtein distances between all pairs of words from rounds n and n-1.

First, we look at the evolution of stability for each shape.

To better understand how the function works, let’s take an example for the shape kesip for one group:

  • At round 7, participants produced puise, kesip, esip, epi
  • At round 8, participants produced puie, suki, kesip, kesip

Stability is obtained by computing the Levenshtein distance between all pairs of words across the two rounds (puise and puie, then puise and suki, and so on). In this case, the stability between rounds 7 and 8 for the shape kesip is equal to 0.59.
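The worked example for the shape kesip can be checked with a short Python sketch. The function names are ours, and two choices are assumptions rather than a description of the analysis code: each distance is divided by the length of the longer word, and the 16 cross-round pairs are simply averaged. Under these assumptions, the mean normalized distance between the two rounds comes out at about 0.59.

```python
from itertools import product


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]


def mean_cross_round_distance(previous: list[str], current: list[str]) -> float:
    """Mean normalized Levenshtein distance over all (previous, current) pairs.

    Assumption: each distance is normalized by the longer word's length.
    """
    distances = [
        levenshtein(a, b) / max(len(a), len(b))
        for a, b in product(previous, current)
    ]
    return sum(distances) / len(distances)


round_7 = ["puise", "kesip", "esip", "epi"]
round_8 = ["puie", "suki", "kesip", "kesip"]
value = mean_cross_round_distance(round_7, round_8)
print(round(value, 2))  # 0.59
```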

Since this value is computed between rounds n and n-1, the plots show the value for stability only from round 1 to round 10.

**Figure 17.** Evolution of stability with time for the group (including biased participants).

We look at the same type of data, this time aggregated by group type, as before.

**Figure 18.** Evolution of stability with time for the group (excluding biased participants).


Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: stab ~ GroupType * Round2 + (1 | GroupID)
   Data: df_agg

REML criterion at convergence: -345.5

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.37624 -0.59170 -0.02896  0.61681  3.16096 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 0.009844 0.09922 
 Residual             0.002886 0.05372 
Number of obs: 140, groups:  GroupID, 14

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)              0.944215   0.039354  13.729821  23.993 1.32e-12 ***
GroupTypeHetero         -0.278455   0.055655  13.729821  -5.003 0.000205 ***
Round2                  -0.035742   0.002235 124.000000 -15.989  < 2e-16 ***
GroupTypeHetero:Round2   0.012773   0.003161 124.000000   4.040 9.30e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) GrpTyH Round2
GroupTypHtr -0.707              
Round2      -0.256  0.181       
GrpTypHt:R2  0.181 -0.256 -0.707

**Figure 19.** Evolution of stability aggregated with time for each type of group: everyone (all) or everyone except the biased participant (Without Biased).

The corresponding model:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Stab ~ TypeGroup * Round2 + (1 | GroupID)
   Data: df_agg

REML criterion at convergence: -505.1

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.1921 -0.5972  0.0185  0.6486  3.1799 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 0.010430 0.10213 
 Residual             0.003367 0.05803 
Number of obs: 210, groups:  GroupID, 14

Fixed effects:
                                  Estimate Std. Error         df t value
(Intercept)                       0.944215   0.040696  14.179876  23.202
TypeGroupHetero_All              -0.278455   0.057552  14.179876  -4.838
TypeGroupHetero_Unbiased         -0.211269   0.057552  14.179876  -3.671
Round2                           -0.035742   0.002415 192.043738 -14.801
TypeGroupHetero_All:Round2        0.012773   0.003415 192.043738   3.740
TypeGroupHetero_Unbiased:Round2   0.008902   0.003415 192.043738   2.607
                                Pr(>|t|)    
(Intercept)                     1.11e-12 ***
TypeGroupHetero_All             0.000254 ***
TypeGroupHetero_Unbiased        0.002470 ** 
Round2                           < 2e-16 ***
TypeGroupHetero_All:Round2      0.000243 ***
TypeGroupHetero_Unbiased:Round2 0.009860 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) TyGH_A TyGH_U Round2 TGH_A:
TypGrpHtr_A -0.707                            
TypGrpHtr_U -0.707  0.950                     
Round2      -0.267  0.189  0.189              
TypGrH_A:R2  0.189 -0.267 -0.134 -0.707       
TypGrH_U:R2  0.189 -0.134 -0.267 -0.707  0.500

3.3 2 - Has the language of the group adapted to the specificity of the biased participant?

In this part, we will look at a set of different models.

First, we aggregate the data by group (excluding, of course, the productions of the biased participant in the heterogeneous groups). This yields clear, interpretable plots (the ones printed in the main paper), but averaging the data loses some information. In these models, Group ID is the only random factor. Then, we look at the data for each individual. The plots are harder to read, but the statistics are more accurate. In these models, the random effects are participants nested within Group ID. For each type of data aggregation, we look both at the evolution of the frequency of biased letters across rounds (including rounds 0 and 10, which are the tests before and after the communication game) and, specifically, at the testing moment (first versus last). We tried to add the round or the testing moment as random slopes, but the models then failed to converge.

To summarize our models, here is what we used:

| Type aggregation | Fixed effect 1 | Fixed effect 2 | Random effects | Plot | Model |
|---|---|---|---|---|---|
| Group | Round | Group Type | Group ID | Figure 2A | model 4a |
| Group | Testing moment | Group Type | Group ID | Figure 2B | model 4b |
| Individual | Round | Group Type | Group ID / part_ID | not in the main paper | model 4, in the main paper |
| Individual | Testing moment | Group Type | Group ID / part_ID | not in the main paper | model 4c |

3.3.1 Per group

3.3.1.1 For all rounds

Here, we calculate the frequency of biased and unbiased letters. For each round, we determine the total frequency of “k” and “a” out of all the letters used in that round to obtain the frequency of biased letters. Similarly, we compute the frequency of “p”, “n”, “s”, “e”, “i”, and “u” out of the total frequency of letters used in the round to obtain the frequency of unbiased letters. Since there are 6 unbiased letters and 2 biased letters, we divide the frequency of biased letters by 2 and the frequency of unbiased letters by 6.

Please note that, in the initial labels, each biased letter occurs slightly more often than each unbiased letter. The number of occurrences of each letter in the initial labels (kesip, esip, nus, nusa, aike, puak, nekuki, anap) is:


a e i k n p s u 
5 4 4 5 4 4 4 4 

Thus, the two biased letters together account for 10 of the 34 letters in the initial labels, i.e. an initial frequency of biased letters of 29.41 %.
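
The letter counts and the 29.41 % figure above can be reproduced with a few lines of code. The following is only a minimal Python sketch, not the analysis script used for this document; the function name `letter_frequencies` and the dictionary keys are hypothetical names chosen for illustration.

```python
from collections import Counter

BIASED = {"k", "a"}                          # letters the biased participant cannot produce
UNBIASED = {"p", "n", "s", "e", "i", "u"}    # the six remaining letters

def letter_frequencies(words):
    """Count letters in a list of words and return biased/unbiased shares in percent."""
    counts = Counter("".join(words))
    total = sum(counts.values())
    biased = 100.0 * sum(counts[letter] for letter in BIASED) / total
    unbiased = 100.0 * sum(counts[letter] for letter in UNBIASED) / total
    return {
        "counts": dict(sorted(counts.items())),
        "biased_pct": biased,
        "unbiased_pct": unbiased,
        # Per-letter shares: divide by the number of letters in each set (2 and 6).
        "biased_pct_per_letter": biased / len(BIASED),
        "unbiased_pct_per_letter": unbiased / len(UNBIASED),
    }

initial_labels = ["kesip", "esip", "nus", "nusa", "aike", "puak", "nekuki", "anap"]
result = letter_frequencies(initial_labels)
print(result["counts"])
print(round(result["biased_pct"], 2))
```

The same function can be applied to the words produced in any round to obtain that round's frequency of biased and unbiased letters.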

In all the following plots, we will represent these initial frequencies with a black dashed line.

**Figure 20.** Evolution of the frequency of biased letters in control groups (red) and heterogeneous groups (blue). Each line represents a group, and the thick line shows the linear regression fitted to all these groups.

Out of curiosity, let’s draw the same plot with a loess regression instead of a linear one:

**Figure 21.** Same as above, but using loess regression.


Let’s use a linear mixed model to test whether time and group type affect the frequency of biased and unbiased letters. The model includes the testing sessions (rounds 0 to 10) and is referred to as model 4a.

Without random slopes for round number:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Freq ~ Round2 * GroupType + (1 | GroupID)
   Data: df_agg

REML criterion at convergence: 739.8

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.6092 -0.5602  0.0435  0.5250  2.5411 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 12.369   3.517   
 Residual              5.398   2.323   
Number of obs: 154, groups:  GroupID, 14

Fixed effects:
                        Estimate Std. Error        df t value Pr(>|t|)    
(Intercept)             28.90363    1.41858  14.38761  20.375 5.14e-12 ***
Round2                   0.00534    0.08373 138.00000   0.064   0.9492    
GroupTypeHetero         -5.21776    2.00618  14.38761  -2.601   0.0206 *  
Round2:GroupTypeHetero   0.48548    0.11841 138.00000   4.100 7.02e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Round2 GrpTyH
Round2      -0.295              
GroupTypHtr -0.707  0.209       
Rnd2:GrpTyH  0.209 -0.707 -0.295

With random slopes for round number:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Freq ~ Round2 * GroupType + (1 + Round2 | GroupID)
   Data: df_agg
Control: lmerControl(optimizer = "bobyqa")

REML criterion at convergence: 715.1

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.1058 -0.4826  0.0038  0.4949  1.8808 

Random effects:
 Groups   Name        Variance Std.Dev. Corr 
 GroupID  (Intercept) 25.6494  5.0645        
          Round2       0.1215  0.3486   -0.92
 Residual              4.2363  2.0582        
Number of obs: 154, groups:  GroupID, 14

Fixed effects:
                       Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)            28.90363    1.96386 11.99993  14.718 4.83e-09 ***
Round2                  0.00534    0.15119 11.99994   0.035   0.9724    
GroupTypeHetero        -5.21776    2.77732 11.99993  -1.879   0.0848 .  
Round2:GroupTypeHetero  0.48548    0.21381 11.99994   2.271   0.0424 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Round2 GrpTyH
Round2      -0.873              
GroupTypHtr -0.707  0.617       
Rnd2:GrpTyH  0.617 -0.707 -0.873

The plot and model suggest:

  • There is more variation in heterogenous groups compared to control groups

  • In control groups, the proportion of biased and unbiased letters remains similar to the initial frequency of these letters

  • In heterogenous groups, the proportion of unbiased letters slightly increases with time, while the proportion of biased letters slightly decreases with time.

Let’s look at the same plot, but using aggregated values for all groups:

**Figure 22.** Same plot as above, aggregated by group number.


Interestingly, the frequency of biased letters drops at round 3 in heterogeneous groups. This could be because the unbiased participants paired in that round were each paired with the biased participant in rounds 1 and 2, and thus both have in mind a vocabulary with fewer biased letters.

3.3.1.2 For Testing

The previous plot focused on the evolution of the frequency of biased and unbiased letters across all rounds (0 to 10). Now, we will focus solely on Round 0 and Round 10, namely, the initial (FirstTest, after the passive exposure) and the final testing (LastTest, after the communication game). Here too, we remove the data from the biased participant in the heterogenous groups.

**Figure 23.** Change in the frequency of biased and unbiased letters between the first testing (before the communication game) and the second testing (after the communication game) at the group level. Each point represents a group, and the thin grey lines indicate the within-group design (each group is tested before and after).

We compute the model, referred to as model 4b (see [2 - Group-adaptation to the biased participants] for more information). Note that a model with random slopes for TypeTest did not converge.

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Freq ~ TypeTest * GroupType + (1 | GroupID)
   Data: df_agg

REML criterion at convergence: 131

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.52681 -0.48572  0.05654  0.30750  2.49407 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 3.794    1.948   
 Residual             6.833    2.614   
Number of obs: 28, groups:  GroupID, 14

Fixed effects:
                                   Estimate Std. Error     df t value Pr(>|t|)
(Intercept)                          29.792      1.232 21.287  24.179   <2e-16
TypeTestLast Test                    -4.484      1.397 12.000  -3.209   0.0075
GroupTypeControl                     -1.068      1.742 21.287  -0.613   0.5464
TypeTestLast Test:GroupTypeControl    4.051      1.976 12.000   2.050   0.0629
                                      
(Intercept)                        ***
TypeTestLast Test                  ** 
GroupTypeControl                      
TypeTestLast Test:GroupTypeControl .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) TypTLT GrpTyC
TypTstLstTs -0.567              
GrpTypCntrl -0.707  0.401       
TypTsLT:GTC  0.401 -0.707 -0.567

The interaction between testing moment and group type is not significant (p = 0.063), but this could be due to the low number of groups. We therefore also perform bootstrapping.

3.3.1.2.1 Bootstrapping (Frequency of biased letters)

Bootstrapping is a resampling technique used in statistics to estimate the uncertainty associated with a sample statistic. It involves repeatedly drawing random samples with replacement from the original data set. By creating multiple bootstrap samples, the method allows for the estimation of sampling variability, constructing confidence intervals, and assessing the statistical significance of results. Bootstrapping is particularly useful when the sample size is small or when the underlying data distribution is unknown or non-normal, as it provides a robust and flexible approach for inference.

# A tibble: 12 × 6
   term                               estimate   lower upper type  level
   <chr>                                 <dbl>   <dbl> <dbl> <chr> <dbl>
 1 (Intercept)                           29.8  27.1    32.1  norm   0.95
 2 GroupTypeControl                      -1.07 -4.40    2.65 norm   0.95
 3 TypeTestLast Test                     -4.48 -6.89   -1.69 norm   0.95
 4 GroupTypeControl:TypeTestLast Test     4.05 -0.0332  8.02 norm   0.95
 5 (Intercept)                           29.8  27.4    32.2  basic  0.95
 6 GroupTypeControl                      -1.07 -4.25    2.21 basic  0.95
 7 TypeTestLast Test                     -4.48 -6.55   -2.02 basic  0.95
 8 GroupTypeControl:TypeTestLast Test     4.05  0.192   7.77 basic  0.95
 9 (Intercept)                           29.8  27.3    32.2  perc   0.95
10 GroupTypeControl                      -1.07 -4.35    2.11 perc   0.95
11 TypeTestLast Test                     -4.48 -6.95   -2.42 perc   0.95
12 GroupTypeControl:TypeTestLast Test     4.05  0.335   7.91 perc   0.95

**Figure 24.** Confidence intervals estimated with the bootstrapping technique.

Looking at the lower and upper bounds for the interaction between GroupType and TypeTest, the basic and percentile intervals exclude zero, while the normal-approximation interval narrowly includes it (lower bound of -0.033). This hints that the interaction might reach significance with more groups.
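
To make the resampling idea concrete, here is a minimal Python sketch of a percentile bootstrap interval. It is an illustration under simplifying assumptions, not the procedure used above: it resamples a plain list of values and bootstraps their mean, whereas the intervals reported above come from the mixed model. The function name `percentile_bootstrap_ci` and the `changes` values are hypothetical and are not the study's data.

```python
import random
import statistics

def percentile_bootstrap_ci(values, statistic, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for statistic(values)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Recompute the statistic on n_boot resamples drawn with replacement.
    boot_stats = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    alpha = (1.0 - level) / 2.0
    lower = boot_stats[int(alpha * n_boot)]
    upper = boot_stats[min(n_boot - 1, int((1.0 - alpha) * n_boot))]
    return lower, upper

# Hypothetical per-group changes (last test minus first test); NOT the study's data.
changes = [-6.1, -4.0, -7.3, 0.5, -5.2, 1.1, -3.8]
lower, upper = percentile_bootstrap_ci(changes, statistics.mean)
excludes_zero = not (lower <= 0.0 <= upper)
print(lower, upper, excludes_zero)
```

An interval that excludes zero is read the same way as in the table above: the resampled estimates fall almost entirely on one side of zero.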

3.3.1.2.2 How to explain Group 2 and Group 6 performance?

@fig-freq-testing2 also highlights that two groups within the heterogeneous condition (Group 2 and Group 6) deviate from this pattern. Our hypothesis is that these groups did not adapt to the biased participant because their participants remembered the initial labels too well, causing them to stick to those words. To investigate further, let’s examine the accuracy of the initial learning phase for all groups:

**Figure 25.** Relation between adaptability and performance at learning the initial words. Here, we look at the initial accuracy (binary, 0 or 1) and the initial word distance (Levenshtein distance) in the first testing for each group. We expect the performance to be better for Group 2 and Group 6.

It appears that Group 2 and Group 6 (the groups that did not adapt to the biased participants) also exhibited higher learning accuracy. Let’s check whether our hypothesis is supported by examining whether the words these groups used at the end were nearly identical to the initial words.

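For Group 2:

Before communicating labels:

| PartID | aike | anap | esip | kesip | nekuki | nus | nusa | puak |
|---|---|---|---|---|---|---|---|---|
| 1 | eipe | unup | esip | nesip | nepupi | esu | nusu | puei |
| 2 | aike | upak | esip | kesip | nusa | nusa | nusa | puak |
| 3 | anap | anap | esip | kesip | nekuki | nus | nusa | puak |
| 4 | aike | kusap | esi | kesi | sup | nus | nuk | nuki |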

After communicating labels:

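| PartID | aike | anap | esip | kesip | nekuki | nus | nusa | puak |
|---|---|---|---|---|---|---|---|---|
| 1 | eipe | unup | esip | nesip | nepupi | nus | nusu | pue |
| 2 | aike | anap | pua | kesip | kukupa | nus | nusa | pua |
| 3 | aike | anap | esip | kesip | nekuki | nus | nusa | pua |
| 4 | aike | anap | esip | kesip | kukupa | nus | nusa | pua |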

For Group 6:

Before communicating labels:

PartID nusa nus kesip esip puak nekuki anap aike
1 esei enep nuis senui nus nusi nesupi
2 aike anap esip kesip nekuki nun nusa puak
3 aike anap esip sekin nekuki sap nusa puak
4 peki ani sunak nekaki nusaki nuk nekaki nusak

After communicating labels:

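| PartID | nusa | nus | kesip | esip | puak | nekuki | anap | aike |
|---|---|---|---|---|---|---|---|---|
| 1 | eipi | enep | epui | puise | pepupi | nuse | nus | epei |
| 2 | aike | anap | seki | kesip | nekuki | nan | nusa | puak |
| 3 | aike | anap | seki | esip | nekuki | nak | nusa | puak |
| 4 | kuap | ani | nekaki | epi | nesaki | suk | peike | kuap |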

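Let’s test whether the difference in initial learning accuracy between the groups that adapted and those that did not is significant: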

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: mean_acc ~ GroupOK + (1 | GroupNum)
   Data: df_agg
Control: lmerControl(optimizer = "bobyqa")

REML criterion at convergence: 249.5

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.17398 -0.79615  0.04701  0.73152  1.45716 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupNum (Intercept)   0.0     0.00   
 Residual             707.2    26.59   
Number of obs: 28, groups:  GroupNum, 7

Fixed effects:
                   Estimate Std. Error     df t value Pr(>|t|)    
(Intercept)          36.250      5.946 26.000   6.096 1.92e-06 ***
GroupOKDidNotAdapt   21.563     11.125 26.000   1.938   0.0635 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr)
GrpOKDdNtAd -0.535
optimizer (bobyqa) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')

The random-intercept variance is estimated at zero (singular fit), so the mixed model reduces to a simple linear regression, which gives the same estimates:

Call:
lm(formula = mean_acc ~ GroupOK, data = df_agg)

Residuals:
   Min     1Q Median     3Q    Max 
-57.81 -21.17   1.25  19.45  38.75 

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
(Intercept)          36.250      5.946   6.096 1.92e-06 ***
GroupOKDidNotAdapt   21.563     11.125   1.938   0.0635 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 26.59 on 26 degrees of freedom
Multiple R-squared:  0.1263,    Adjusted R-squared:  0.09265 
F-statistic: 3.757 on 1 and 26 DF,  p-value: 0.06352

For a closer look at the relation between learning accuracy and adaptability, please refer to part 3 - Who adapts and when? and subpart Predicting adaptation.

Now, we look at how the similarity between the productions and the initial labels evolves over time, with a focus on these two groups:

**Figure 27.** Evolution of the production similarity at each round and for each group. Please note that this plot does not include the productions of the biased participant.


This plot helps explain why Group 2 did not adapt: the participants of this group converged on the initial labels. However, no clear explanation emerges for the performance of Group 6.

3.3.2 Per individual

3.3.2.1 For all rounds

This is the same plot as before, except that we plot here the results for each participant. See the part [2 - Group-adaptation to the biased participants] for more information.

Please note that while we did not print this plot in the main paper, the model associated with this plot is the one reported in the main paper.

**Figure 28.** Evolution of the frequency of biased letters in control groups (red) and heterogeneous groups (blue). Each line represents a group, and the thick line shows the linear regression fitted to all these groups.

Let’s look at the model, included in the main paper and referred to as model 4. Please note that this model includes the testing sessions (rounds 0 to 10). We did not include the round as a random slope.

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Freq ~ Round2 * GroupType + (1 | GroupID/PartID_unique)
   Data: df_agg

REML criterion at convergence: 3054.5

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.6513 -0.5190 -0.0076  0.5139  4.2969 

Random effects:
 Groups                Name        Variance Std.Dev.
 PartID_unique:GroupID (Intercept)  4.541   2.131   
 GroupID               (Intercept) 10.630   3.260   
 Residual                          14.059   3.750   
Number of obs: 539, groups:  PartID_unique:GroupID, 49; GroupID, 14

Fixed effects:
                         Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)             28.889341   1.356657  12.697214  21.295 2.58e-11 ***
Round2                   0.008185   0.067563 488.000501   0.121   0.9036    
GroupTypeHetero         -5.176293   1.946370  13.455257  -2.659   0.0192 *  
Round2:GroupTypeHetero   0.469077   0.103204 488.000500   4.545 6.93e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Round2 GrpTyH
Round2      -0.249              
GroupTypHtr -0.697  0.174       
Rnd2:GrpTyH  0.163 -0.655 -0.265

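Adding round or round type as random slopes does not give a usable model; with round as a random slope, the fit is singular (the group-level random intercept and slope are perfectly correlated, at -1.00):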

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Freq ~ Round2 * GroupType + (1 + Round2 | GroupID/PartID_unique)
   Data: df_agg
Control: lmerControl(optimizer = "bobyqa")

REML criterion at convergence: 3005.6

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.9117 -0.4638  0.0175  0.4466  3.8943 

Random effects:
 Groups                Name        Variance Std.Dev. Corr 
 PartID_unique:GroupID (Intercept)  4.5471  2.1324        
                       Round2       0.1386  0.3723   -0.43
 GroupID               (Intercept) 24.0662  4.9057        
                       Round2       0.1013  0.3183   -1.00
 Residual                          11.7363  3.4258        
Number of obs: 539, groups:  PartID_unique:GroupID, 49; GroupID, 14

Fixed effects:
                        Estimate Std. Error        df t value Pr(>|t|)    
(Intercept)            28.889341   1.932302 11.442407  14.951  7.3e-09 ***
Round2                  0.008185   0.152439 11.257713   0.054   0.9581    
GroupTypeHetero        -5.176293   2.750667 11.748949  -1.882   0.0849 .  
Round2:GroupTypeHetero  0.469077   0.222252 12.597970   2.111   0.0554 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Round2 GrpTyH
Round2      -0.864              
GroupTypHtr -0.702  0.607       
Rnd2:GrpTyH  0.592 -0.686 -0.849
optimizer (bobyqa) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')

3.3.2.2 For Testing

This plot is the same as in Figure 23, except that we plot here the output for each participant.

**Figure 29.** Evolution of the mean frequency of biased and unbiased letters for each individual in each group type.

This model is referred to as model 4c:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Freq ~ TypeTest * GroupType + (1 | GroupID/PartID_unique)
   Data: df_freq

REML criterion at convergence: 553.6

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.79806 -0.65303 -0.00725  0.63814  2.11781 

Random effects:
 Groups                Name        Variance Std.Dev.
 PartID_unique:GroupID (Intercept)  4.131   2.033   
 GroupID               (Intercept)  3.296   1.815   
 Residual                          13.193   3.632   
Number of obs: 98, groups:  PartID_unique:GroupID, 49; GroupID, 14

Fixed effects:
                                  Estimate Std. Error      df t value Pr(>|t|)
(Intercept)                        28.7550     1.0438 15.1128  27.548 2.47e-14
TypeTestLast Test                  -0.4778     0.9707 47.0000  -0.492   0.6249
GroupTypeHetero                     1.0060     1.5444 18.1165   0.651   0.5230
TypeTestLast Test:GroupTypeHetero  -3.9701     1.4828 47.0000  -2.677   0.0102
                                     
(Intercept)                       ***
TypeTestLast Test                    
GroupTypeHetero                      
TypeTestLast Test:GroupTypeHetero *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) TypTLT GrpTyH
TypTstLstTs -0.465              
GroupTypHtr -0.676  0.314       
TypTsLT:GTH  0.304 -0.655 -0.480

Here too, adding random slopes for TypeTest causes the model not to converge.

3.3.3 Summary: per group & per pair

Using violin plots and standard errors

**Figure 30.** Evolution of the frequency of both biased letters (red) and unbiased letters (blue). Each point represents a participant and their production at a specific moment: during their last interaction with an unbiased participant (left panel, left side) and with a biased participant (left panel, right side), during the first testing before the communication game (right panel, left side), and during the last testing after the communication game (right panel, right side).

3.4 3 - Who adapts and when?

3.4.1 Pair-level adaptation

We focus here on the pair level: we compare the productions of the pairs excluding the biased participant (pairs 2 - 3, 3 - 4, and 2 - 4) with the productions of the pairs involving the biased participant (pairs 1 - 2, 1 - 3, and 1 - 4). Important note: the number of words considered differs between pairs without and with the biased participant, as we do not consider the productions of Participant 1. For example, in Round 1, we compare the output of:

  • Pair without the biased participant (Hetero_unbiased): production of participant 3 and participant 4 -> 16 words in total

  • Pair involving the biased participant (Hetero_biased): production of participant 2 only (we did not look at the frequency of biased letter for participant 1 because this participant is unable to produce any of these letters) -> 8 words in total

**Figure 31.** Change in the frequency of biased and unbiased letters between the first testing (before the communication game) and the second testing (after the communication game) at the group level. Each point represents a group, and the thin grey lines indicate the within-group design (each group is tested before and after).

An interesting observation appears: in Round 3, pairs of unbiased individuals use fewer biased letters than the unbiased participants involved in the preceding rounds. One hypothesis is that these two unbiased participants were paired with the biased participant in the preceding rounds and have already started to adjust to them.

We merge these results over the rounds, to have a look at the global picture:

**Figure 32.** Evolution of the mean frequency of biased letters comparing two types of pair: pair including the biased participant (1-2, 1-3, and 1-4) and pair excluding the biased participant (2-3, 3-4, 2-4).


We build two models. In the first one, the group type is a fixed effect (but please note that this introduces a bias, since in control groups there is no “talk to bias” or “talk to unbias”; instead, the variable reflects whether participants 2, 3, and 4 talk to participant 1 or not).

This is not the model 5 referred to in the main paper, since the control groups are also included here:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Freq ~ GroupType * TalkToBiased + (1 | GroupID/PartID_unique)
   Data: df_agg

REML criterion at convergence: 309.5

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.26916 -0.38120 -0.03873  0.32515  2.28763 

Random effects:
 Groups                Name        Variance Std.Dev.
 PartID_unique:GroupID (Intercept) 1.013    1.006   
 GroupID               (Intercept) 2.902    1.704   
 Residual                          1.064    1.031   
Number of obs: 84, groups:  PartID_unique:GroupID, 42; GroupID, 14

Fixed effects:
                                Estimate Std. Error      df t value Pr(>|t|)
(Intercept)                      14.5818     0.7166 13.2669  20.350 2.16e-11
GroupTypeHetero                  -1.0773     1.0134 13.2669  -1.063 0.306723
TalkToBiasedYes                  -0.1710     0.3183 40.0000  -0.537 0.594076
GroupTypeHetero:TalkToBiasedYes  -1.6846     0.4501 40.0000  -3.742 0.000573
                                   
(Intercept)                     ***
GroupTypeHetero                    
TalkToBiasedYes                    
GroupTypeHetero:TalkToBiasedYes ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) GrpTyH TlkTBY
GroupTypHtr -0.707              
TalkToBsdYs -0.222  0.157       
GrpTyH:TTBY  0.157 -0.222 -0.707

Then, we look only at the heterogeneous groups. This is the model 5 referred to in the main paper:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Freq ~ TalkToBiased + (1 | GroupID/PartID_unique)
   Data: df_agg

REML criterion at convergence: 171.5

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.91303 -0.32334 -0.08062  0.29827  1.78595 

Random effects:
 Groups                Name        Variance Std.Dev.
 PartID_unique:GroupID (Intercept) 1.757    1.326   
 GroupID               (Intercept) 5.415    2.327   
 Residual                          1.470    1.212   
Number of obs: 42, groups:  PartID_unique:GroupID, 21; GroupID, 7

Fixed effects:
                Estimate Std. Error      df t value Pr(>|t|)    
(Intercept)      13.5046     0.9630  6.4768   14.02 4.35e-06 ***
TalkToBiasedYes  -1.8556     0.3741 20.0000   -4.96 7.54e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr)
TalkToBsdYs -0.194

3.4.2 Predictors of adaptability

3.4.2.1 Our measures of adaptability

We compute four measures of adaptability based on the ratio of letter frequency:

  • Adapt1. The first one reflects, for each participant, the difference between the frequency of biased letter in the FirstTesting and the frequency of biased letter in the SecondTesting. A larger number indicates a greater decrease in the number of biased letters in the participants’ final vocabulary (so the participant adapted to the peculiarities of the biased participant). Conversely, a negative score for adaptability means that the person has increased the number of biased letters (so the participant did not adapt to the peculiarities of the biased participant).

  • Adapt2. The measure is very similar to the first one, except that here we do not look at the First and Second Testing, but at all interactions with the biased participants made from Round 1 to 9.

  • Adapt3. This measure is the most complex, and the most likely to efficiently represent adaptability. We arbitrarily created a score based on a decision tree (see image below):

  • Adapt4. This is similar to Adapt3, except that here, we look at the normalized distance between:

    • d1: the new item produced and the item the partner has said

    • d2: the new item produced and the last item produced by the participant.

      We then computed Adapt4 as 2*(1-d1) + d2 (see below).
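
The two scores that have an explicit formula, Adapt1 and Adapt4, can be sketched as follows. This is a minimal illustration and not the analysis code of the study: the function names are ours, the inputs are assumed to be already computed (biased-letter frequencies for Adapt1, distances normalized to the range 0 to 1 for Adapt4), and the example values are made up.

```python
def adapt1(freq_first: float, freq_second: float) -> float:
    """Drop in biased-letter frequency from FirstTesting to SecondTesting.

    Positive: fewer biased letters at the end (the participant adapted).
    Negative: more biased letters at the end (the participant did not adapt).
    """
    return freq_first - freq_second


def adapt4(d1: float, d2: float) -> float:
    """Adapt4 = 2*(1-d1) + d2, with both distances assumed to lie in [0, 1].

    d1: distance between the new item and the item the partner said.
    d2: distance between the new item and the participant's own last item.
    """
    if not (0.0 <= d1 <= 1.0 and 0.0 <= d2 <= 1.0):
        raise ValueError("d1 and d2 are assumed to be normalized to [0, 1]")
    return 2 * (1 - d1) + d2


# Hypothetical values, for illustration only.
print(adapt1(12, 8))     # the participant dropped 4 biased letters
print(adapt4(0.0, 1.0))  # copies the partner, abandons own last item
print(adapt4(1.0, 0.0))  # ignores the partner, repeats own last item
```

Under the normalization assumed here, Adapt4 ranges from 0 (no movement toward the partner) to 3 (full alignment with the partner and full departure from one's own previous item).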

We look at Adapt1 and Adapt2 only in heterogeneous groups, and at Adapt3 and Adapt4 in both groups (heterogeneous and control).

Do these measures correlate with each other, and how are they distributed?

**Figure 34.** Correlation between the adaptability scores.


The correlation between Adaptability 1 and Adaptability 2 is quite low, although we expected it to be higher given how similar the two measures are.

3.4.2.2 Summary predictors

Nevertheless, we examine how our individual metrics (such as age and gender; see Method) correlate with the adaptability scores. First, we look at the distribution of the variables.

**Figure 35.** Distribution of the scores. Red shows the results for the control group, blue shows the results for the heterogeneous group.


Just a reminder:

  • Global working memory is the overall inverse efficiency (reaction time divided by accuracy) in the task-switching experiment, computed only on the letters+numbers task: the higher the value, the worse the working memory.
  • Cognitive flexibility is measured in the letters+numbers task of the task-switching experiment. It is the difference, normalized by time, between the time needed to switch between two tasks and the time needed to repeat the same task: the higher the value, the worse the cognitive flexibility.
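
The two cognitive scores described above can be sketched as follows. This is only an illustration under stated assumptions: the function names and example values are ours, accuracy is assumed to be a proportion between 0 and 1, and because the text does not say which time is used for the normalization, it is left as an explicit argument (`norm_time`) rather than guessed.

```python
def inverse_efficiency(mean_rt: float, accuracy: float) -> float:
    """Reaction time divided by accuracy; higher means worse performance."""
    if not 0.0 < accuracy <= 1.0:
        raise ValueError("accuracy is assumed to be a proportion in (0, 1]")
    return mean_rt / accuracy


def switch_cost(rt_switch: float, rt_repeat: float, norm_time: float) -> float:
    """Extra time needed on switch trials, divided by a normalizing time.

    norm_time is a placeholder for whichever time the analysis normalizes by.
    Higher values mean a larger cost of switching (worse flexibility).
    """
    if norm_time <= 0:
        raise ValueError("norm_time must be positive")
    return (rt_switch - rt_repeat) / norm_time


# Hypothetical participant: reaction times in milliseconds.
print(inverse_efficiency(mean_rt=900.0, accuracy=0.9))
print(switch_cost(rt_switch=1100.0, rt_repeat=900.0, norm_time=1000.0))
```

Both scores point in the same direction: a slower or less accurate participant gets a higher inverse efficiency, and a participant who slows down more on switch trials gets a higher switch cost.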

Let’s look at the correlations between the numerical variables:

**Figure 36.** Correlation between the predictors.


3.4.2.3 Predicting adaptation

To see this more clearly (and to add the effect of gender, which we could not see in the plot above), we take a closer look at each of our adaptability measures.

3.4.2.3.1 Adaptability 1

We first look at the measure using the score of Adaptability1:

(Please note that this variable only includes participants from heterogeneous groups.)

**Figure 37.** Correlations between the different individual measures and our score for Adaptability1. Reminder: in the dictator game, 1 means keeping all the money for oneself and 5 means sharing it all.


Model:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Adaptability1 ~ prosoc_z + WorkingMem_z + CogFlexibility_z +  
    DictatorGame_z + Age_z + Gender_z + AccLearning_z + ProducSim_z +  
    (1 | GroupNum)
   Data: df_other2_z

REML criterion at convergence: 94.1

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.19689 -0.50159 -0.08027  0.57271  0.91707 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupNum (Intercept) 31.66    5.627   
 Residual             30.88    5.557   
Number of obs: 21, groups:  GroupNum, 7

Fixed effects:
                 Estimate Std. Error      df t value Pr(>|t|)
(Intercept)        4.1033     2.7219  3.5991   1.507    0.214
prosoc_z          -3.8259     4.4046 10.6155  -0.869    0.404
WorkingMem_z       1.5921     4.6609 11.9684   0.342    0.739
CogFlexibility_z  -1.5782     6.3158  8.2278  -0.250    0.809
DictatorGame_z    -2.0221     2.0809 11.7485  -0.972    0.351
Age_z             -0.2419     0.5957 11.3888  -0.406    0.692
Gender_z          -0.1714     4.3611  6.9454  -0.039    0.970
AccLearning_z     -8.2086    17.1805  9.7835  -0.478    0.643
ProducSim_z       10.6719    22.3196 10.7742   0.478    0.642

Correlation of Fixed Effects:
            (Intr) prsc_z WrknM_ CgFlx_ DcttG_ Age_z  Gndr_z AccLr_
prosoc_z     0.224                                                 
WorkingMm_z -0.165 -0.294                                          
CgFlxblty_z -0.032  0.181  0.416                                   
DictatrGm_z  0.132  0.206 -0.352 -0.171                            
Age_z        0.240  0.735 -0.064  0.239 -0.070                     
Gender_z    -0.396 -0.540  0.436  0.016 -0.077 -0.580              
AccLernng_z  0.034  0.036  0.393  0.113  0.110  0.169  0.254       
ProducSim_z -0.085 -0.231 -0.071 -0.003 -0.144 -0.295 -0.045 -0.892
            R2m       R2c
[1,] 0.06919029 0.5404263

R²m focuses solely on the fixed effects’ contribution to explaining variance, while R²c takes into account both the fixed effects and the random effects.
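
To make the two quantities concrete, the sketch below applies the usual variance decomposition for a Gaussian random-intercept model: R²m is the fixed-effect variance over the total variance, and R²c is the fixed-effect plus random-effect variance over the total. It assumes this is the definition behind the values printed above, which the text does not state. The fixed-effect variance is not printed in the summary, so it is backed out from the reported R²m; the function names are ours.

```python
def r2_marginal_conditional(var_fixed, var_random, var_resid):
    """Return (R2m, R2c) from the three variance components."""
    total = var_fixed + var_random + var_resid
    r2m = var_fixed / total
    r2c = (var_fixed + var_random) / total
    return r2m, r2c


def fixed_variance_from_r2m(r2m, var_random, var_resid):
    """Invert the R2m formula to recover the fixed-effect variance."""
    return r2m * (var_random + var_resid) / (1 - r2m)


# Variance components from the Adaptability1 model summary above.
var_random = 31.66  # GroupNum intercept variance
var_resid = 30.88   # residual variance

# Fixed-effect variance implied by the reported R2m.
var_fixed = fixed_variance_from_r2m(0.06919029, var_random, var_resid)
r2m, r2c = r2_marginal_conditional(var_fixed, var_random, var_resid)

print(f"Implied fixed-effect variance: {var_fixed:.3f}")
print(f"R2m = {r2m:.4f}, R2c = {r2c:.4f}")
```

With these inputs the recomputed R²c agrees with the reported value to about three decimals, the small gap coming from the rounded variances in the summary. The same formula shows why R²m and R²c coincide whenever the random-intercept variance is estimated as zero (a singular fit).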

3.4.2.3.2 Adaptability 2

We look at the exact same plots using the score for Adaptability2:

**Figure 38.** Correlations between the different individual measures and our score for Adaptability2. Reminder: in the dictator game, 1 means keeping all the money for oneself and 5 means sharing it all.


Model:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Adaptability1 ~ prosoc_z + WorkingMem_z + CogFlexibility_z +  
    DictatorGame_z + Age_z + Gender_z + AccLearning_z + ProducSim_z +  
    (1 | GroupID)
   Data: df_other2_z

REML criterion at convergence: 94.1

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.19689 -0.50159 -0.08027  0.57271  0.91707 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 31.66    5.627   
 Residual             30.88    5.557   
Number of obs: 21, groups:  GroupID, 7

Fixed effects:
                 Estimate Std. Error      df t value Pr(>|t|)
(Intercept)        4.1033     2.7219  3.5991   1.507    0.214
prosoc_z          -3.8259     4.4046 10.6155  -0.869    0.404
WorkingMem_z       1.5921     4.6609 11.9684   0.342    0.739
CogFlexibility_z  -1.5782     6.3158  8.2278  -0.250    0.809
DictatorGame_z    -2.0221     2.0809 11.7485  -0.972    0.351
Age_z             -0.2419     0.5957 11.3888  -0.406    0.692
Gender_z          -0.1714     4.3611  6.9454  -0.039    0.970
AccLearning_z     -8.2086    17.1805  9.7835  -0.478    0.643
ProducSim_z       10.6719    22.3196 10.7742   0.478    0.642

Correlation of Fixed Effects:
            (Intr) prsc_z WrknM_ CgFlx_ DcttG_ Age_z  Gndr_z AccLr_
prosoc_z     0.224                                                 
WorkingMm_z -0.165 -0.294                                          
CgFlxblty_z -0.032  0.181  0.416                                   
DictatrGm_z  0.132  0.206 -0.352 -0.171                            
Age_z        0.240  0.735 -0.064  0.239 -0.070                     
Gender_z    -0.396 -0.540  0.436  0.016 -0.077 -0.580              
AccLernng_z  0.034  0.036  0.393  0.113  0.110  0.169  0.254       
ProducSim_z -0.085 -0.231 -0.071 -0.003 -0.144 -0.295 -0.045 -0.892
            R2m       R2c
[1,] 0.06919029 0.5404263

3.4.2.3.3 Adaptability 3

We look at the exact same plots using the score for Adaptability3:

**Figure 39.** Correlations between the different individual measures and our score for Adaptability3. Reminder: in the dictator game, 1 means keeping all the money for oneself and 5 means sharing it all.


Model, including production from the biased participant:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Adaptability2 ~ prosoc_z + WorkingMem_z + CogFlexibility_z +  
    DictatorGame_z + Age_z + Gender_z + AccLearning_z + ProducSim_z +  
    (1 | GroupID)
   Data: df_other_z

REML criterion at convergence: 75.9

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.29011 -0.44312 -0.05865  0.32637  1.59578 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept)  0.0     0.000   
 Residual             10.7     3.271   
Number of obs: 21, groups:  GroupID, 7

Fixed effects:
                  Estimate Std. Error        df t value Pr(>|t|)   
(Intercept)        3.29736    0.97015  12.00000   3.399  0.00528 **
prosoc_z          -1.85364    2.05854  12.00000  -0.900  0.38559   
WorkingMem_z       1.57906    1.97838  12.00000   0.798  0.44029   
CogFlexibility_z   0.41123    3.21404  12.00000   0.128  0.90031   
DictatorGame_z    -1.41578    0.95146  12.00000  -1.488  0.16255   
Age_z              0.03073    0.28080  12.00000   0.109  0.91465   
Gender_z          -0.33332    2.28204  12.00000  -0.146  0.88630   
AccLearning_z    -17.20183    8.33325  12.00000  -2.064  0.06131 . 
ProducSim_z       25.30525   10.46976  12.00000   2.417  0.03250 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) prsc_z WrknM_ CgFlx_ DcttG_ Age_z  Gndr_z AccLr_
prosoc_z     0.332                                                 
WorkingMm_z -0.216 -0.200                                          
CgFlxblty_z  0.029  0.318  0.238                                   
DictatrGm_z  0.245  0.283 -0.201 -0.062                            
Age_z        0.425  0.750 -0.093  0.265  0.259                     
Gender_z    -0.612 -0.522  0.354 -0.165 -0.124 -0.631              
AccLernng_z  0.135 -0.061  0.230 -0.156  0.316  0.074  0.178       
ProducSim_z -0.169 -0.003  0.024  0.212 -0.303 -0.119 -0.043 -0.911
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
           R2m       R2c
[1,] 0.3327252 0.3327252

Model, excluding production from the biased participant:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Adaptability3 ~ prosoc_z + WorkingMem_z + CogFlexibility_z +  
    DictatorGame_z + Age_z + Gender_z + AccLearning_z + ProducSim_z +  
    (1 | GroupID)
   Data: df_other_z[!(df_other_z$PartID == 1 & df_other_z$GroupType ==  
    "Hetero"), ]

REML criterion at convergence: -33

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.57561 -0.51577 -0.07565  0.64390  1.84356 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 0.003468 0.05889 
 Residual             0.012497 0.11179 
Number of obs: 49, groups:  GroupID, 14

Fixed effects:
                   Estimate Std. Error         df t value Pr(>|t|)   
(Intercept)      -0.1026441  0.0262396 17.1561746  -3.912  0.00111 **
prosoc_z          0.0303311  0.0350553 35.9406462   0.865  0.39265   
WorkingMem_z     -0.0475077  0.0425451 38.0386109  -1.117  0.27115   
CogFlexibility_z  0.0618350  0.0916649 35.7784849   0.675  0.50428   
DictatorGame_z    0.0009374  0.0272844 39.0750590   0.034  0.97277   
Age_z             0.0005543  0.0041543 39.2380299   0.133  0.89453   
Gender_z          0.0286825  0.0401149 35.0562824   0.715  0.47934   
AccLearning_z     0.0452152  0.1827187 37.6833403   0.247  0.80590   
ProducSim_z       0.0260163  0.2379260 38.4265997   0.109  0.91350   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) prsc_z WrknM_ CgFlx_ DcttG_ Age_z  Gndr_z AccLr_
prosoc_z    -0.054                                                 
WorkingMm_z  0.046  0.102                                          
CgFlxblty_z -0.028  0.260  0.214                                   
DictatrGm_z  0.033 -0.045 -0.069 -0.092                            
Age_z        0.148  0.084  0.041 -0.013 -0.186                     
Gender_z    -0.473  0.106 -0.058  0.038  0.038 -0.218              
AccLernng_z  0.090 -0.112  0.217 -0.008  0.210 -0.004  0.201       
ProducSim_z -0.139  0.136 -0.043  0.038 -0.106 -0.035 -0.119 -0.908
           R2m       R2c
[1,] 0.0898214 0.2875562

3.4.2.3.4 Adaptability 4

We look at the exact same plots using the score for Adaptability4:

**Figure 40.** Correlations between the different individual measures and our score for Adaptability4. Reminder: in the dictator game, 1 means keeping all the money for oneself and 5 means sharing it all.


Model including the production from the biased participant:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Adaptability4 ~ prosoc_z + WorkingMem_z + CogFlexibility_z +  
    DictatorGame_z + Age_z + Gender_z + AccLearning_z + ProducSim_z +  
    (1 | GroupID)
   Data: df_other_z

REML criterion at convergence: 2.3

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.97671 -0.52383  0.09248  0.63485  1.72760 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 0.02724  0.1650  
 Residual             0.02452  0.1566  
Number of obs: 56, groups:  GroupID, 14

Fixed effects:
                   Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)       1.6743796  0.0520384 15.3073297  32.176 1.77e-15 ***
prosoc_z         -0.0215632  0.0484942 38.5936249  -0.445   0.6591    
WorkingMem_z      0.0124828  0.0564002 41.9955380   0.221   0.8259    
CogFlexibility_z -0.0007215  0.1263691 37.2664798  -0.006   0.9955    
DictatorGame_z   -0.0082334  0.0395368 41.4556476  -0.208   0.8361    
Age_z            -0.0017282  0.0051478 37.4816895  -0.336   0.7390    
Gender_z          0.0053707  0.0560555 39.1659955   0.096   0.9242    
AccLearning_z    -0.3796356  0.1880709 39.5658058  -2.019   0.0503 .  
ProducSim_z       0.5098609  0.2441742 39.2527811   2.088   0.0433 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) prsc_z WrknM_ CgFlx_ DcttG_ Age_z  Gndr_z AccLr_
prosoc_z    -0.051                                                 
WorkingMm_z  0.002  0.121                                          
CgFlxblty_z -0.031  0.203  0.334                                   
DictatrGm_z  0.013  0.032 -0.073 -0.069                            
Age_z        0.016 -0.036 -0.121 -0.033 -0.156                     
Gender_z    -0.346  0.148 -0.005  0.088 -0.038 -0.046              
AccLernng_z -0.086 -0.132  0.188  0.115  0.127 -0.149  0.247       
ProducSim_z  0.042  0.188  0.008 -0.048  0.027  0.093 -0.122 -0.839
            R2m       R2c
[1,] 0.06788929 0.5583678
        prosoc_z     WorkingMem_z CogFlexibility_z   DictatorGame_z 
        1.130551         1.362975         1.178453         1.182312 
           Age_z         Gender_z    AccLearning_z      ProducSim_z 
        1.058726         1.176554         4.852170         4.322597 

Model excluding productions from the biased participants in heterogeneous groups:

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Adaptability4 ~ prosoc + WorkingMem + CogFlexibility + DictatorGame +  
    Age + Gender + AccLearning + ProducSim + (1 | GroupID)
   Data: df_other_z[!(df_other_z$PartID == 1 & df_other_z$GroupType ==  
    "Hetero"), ]

REML criterion at convergence: -1.8

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.53369 -0.62004  0.06346  0.62015  1.46774 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 0.02709  0.1646  
 Residual             0.01983  0.1408  
Number of obs: 49, groups:  GroupID, 14

Fixed effects:
                 Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)     1.9164854  0.3047512 31.0000785   6.289 5.41e-07 ***
prosoc         -0.0271676  0.0466917 30.8501940  -0.582    0.565    
WorkingMem     -0.0284118  0.0574314 31.5309274  -0.495    0.624    
CogFlexibility  0.0175435  0.1218276 30.6041833   0.144    0.886    
DictatorGame    0.0037392  0.0394754 36.7633587   0.095    0.925    
Age            -0.0004194  0.0057419 33.9505566  -0.073    0.942    
GenderM         0.0248748  0.0530205 30.1769781   0.469    0.642    
AccLearning     0.0290540  0.2463463 31.5907232   0.118    0.907    
ProducSim      -0.1360715  0.3232488 32.1749557  -0.421    0.677    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) prosoc WrkngM CgFlxb DcttrG Age    GendrM AccLrn
prosoc      -0.677                                                 
WorkingMem  -0.447  0.095                                          
CogFlexblty -0.330  0.271  0.228                                   
DictatorGam -0.211 -0.013 -0.078 -0.081                            
Age         -0.335  0.030  0.049 -0.017 -0.274                     
GenderM     -0.025  0.099 -0.063  0.054  0.057 -0.164              
AccLearning  0.127 -0.063  0.213  0.040  0.216  0.019  0.229       
ProducSim   -0.324  0.078 -0.037 -0.007 -0.091 -0.063 -0.152 -0.908
           R2m       R2c
[1,] 0.0167854 0.5844647
        prosoc     WorkingMem CogFlexibility   DictatorGame            Age 
      1.110681       1.436756       1.140272       1.312834       1.131158 
        Gender    AccLearning      ProducSim 
      1.180877       9.059043       7.886534 

3.4.2.4 Other analysis: predictors, adaptability

Let’s have a look at how the adaptability score of participants evolves over time:

Let’s observe whether higher adaptability correlates with better communicative success on average:

It appears that communicative success is correlated with each participant’s adaptability score (Adaptability4). Is this relationship significant? To decide between a Spearman and a Pearson correlation test, we first assessed whether the relationship between the two variables is linear. Since a linear and a quadratic model fit the data equally well, we applied both the Pearson and the Spearman tests.


    Pearson's product-moment correlation

data:  df_cor$CommunicativeSuccess and df_cor$Adaptability4
t = 4.4871, df = 54, p-value = 3.82e-05
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.2992439 0.6895684
sample estimates:
      cor 
0.5211453 

    Spearman's rank correlation rho

data:  df_cor$CommunicativeSuccess and df_cor$Adaptability4
S = 14529, p-value = 7.647e-05
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.5034611 
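
The test statistics printed above follow directly from the two correlation coefficients and the sample size. The sketch below recomputes them with the standard formulas (t statistic for Pearson's r, Fisher z-transform for the confidence interval, and the no-ties formula for Spearman's S); the variable names are ours, and the no-ties assumption for S is an approximation.

```python
import math
from statistics import NormalDist

n = 56  # participants, so df = n - 2 = 54

# Pearson: t statistic from r.
r = 0.5211453
df = n - 2
t_stat = r * math.sqrt(df) / math.sqrt(1 - r**2)

# Pearson: 95% confidence interval via the Fisher z-transform.
z = math.atanh(r)
half_width = NormalDist().inv_cdf(0.975) / math.sqrt(n - 3)
ci_low = math.tanh(z - half_width)
ci_high = math.tanh(z + half_width)

# Spearman: S is the sum of squared rank differences; without ties,
# rho = 1 - 6*S / (n^3 - n), so S can be recovered from rho.
rho = 0.5034611
s_stat = (n**3 - n) / 6 * (1 - rho)

print(f"t = {t_stat:.4f} on {df} df")
print(f"95% CI for r: [{ci_low:.4f}, {ci_high:.4f}]")
print(f"S = {s_stat:.0f}")
```

The recomputed t statistic and confidence interval agree with the Pearson output to the printed precision, and the recomputed S rounds to the value shown in the Spearman output.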

Next, we test whether communicative success positively affected the measures of prosociality. Because these measures were collected after the main experiment, they may have been influenced by it.

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: prosoc ~ MeanComSuccess + (1 | GroupID)
   Data: df_both2

REML criterion at convergence: 89.6

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.4324 -0.3436  0.0968  0.6252  1.3861 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 0.02683  0.1638  
 Residual             0.26187  0.5117  
Number of obs: 56, groups:  GroupID, 14

Fixed effects:
               Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)     3.74048    0.32618 16.17875  11.467  3.5e-09 ***
MeanComSuccess -0.03742    0.45121 16.49485  -0.083    0.935    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr)
MeanCmSccss -0.969

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: DictatorGame ~ MeanComSuccess + (1 | GroupID)
   Data: df_both2

REML criterion at convergence: 114.4

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.69756  0.02422  0.23531  0.33410  2.92745 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 0.08226  0.2868  
 Residual             0.38975  0.6243  
Number of obs: 56, groups:  GroupID, 14

Fixed effects:
               Estimate Std. Error      df t value Pr(>|t|)    
(Intercept)      3.2857     0.4461 16.3870   7.366 1.38e-06 ***
MeanComSuccess  -0.7142     0.6162 16.8276  -1.159    0.263    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr)
MeanCmSccss -0.967

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: prosoc ~ ACC + (1 | PartID) + (1 | GroupID)
   Data: df3

REML criterion at convergence: 10103.3

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.95153 -0.52747  0.09698  0.72228  1.89562 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 0.085799 0.29292 
 PartID   (Intercept) 0.008744 0.09351 
 Residual             0.189610 0.43544 
Number of obs: 8512, groups:  GroupID, 14; PartID, 4

Fixed effects:
              Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)  3.715e+00  9.161e-02  1.553e+01  40.555   <2e-16 ***
ACC         -1.492e-03  1.088e-02  8.501e+03  -0.137    0.891    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
    (Intr)
ACC -0.082

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: DictatorGame ~ ACC + (1 | PartID) + (1 | GroupID)
   Data: df3

REML criterion at convergence: 13961.5

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.67391 -0.19398  0.06182  0.34439  2.66996 

Random effects:
 Groups   Name        Variance Std.Dev.
 GroupID  (Intercept) 0.160969 0.40121 
 PartID   (Intercept) 0.006625 0.08139 
 Residual             0.298369 0.54623 
Number of obs: 8512, groups:  GroupID, 14; PartID, 4

Fixed effects:
              Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)    2.82868    0.11523   15.76121  24.549 5.52e-14 ***
ACC           -0.06245    0.01365 8500.42792  -4.575 4.83e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
    (Intr)
ACC -0.082

4 Chapter plots

**Figure 41.** Figure 1 of the main paper.


**Figure 43.** Figure 2 of the main paper.
